Filing Language: English (EN)

HIGH THROUGHPUT OR CAPILLARY-BASED SCREENING FOR A BIOACTIVITY OR
BIOMOLECULE

FIELD OF THE INVENTION

[0001] The present invention relates generally to methods
and systems for screening of populations of organisms. In one aspect, the methods and systems are used to
identify bioactive molecules and bioactivities using, e.g., screening techniques such as high throughput
screening and capillary array platforms. The invention may be used with mixed or cultured populations of 
organisms. The invention provides a culture-independent approach to directly clone genes encoding novel 
enzymes from environmental samples containing a mixed population of organisms. The invention provides 
a novel high throughput cultivation method based on the combination of a single cell encapsulation 
procedure with flow cytometry that enables cells to grow with nutrients that are present at environmental 
concentrations. 

BACKGROUND OF THE INVENTION

[0002] There is a critical need in the chemical industry for
efficient catalysts for the practical synthesis of optically pure materials; enzymes can provide the optimal 
solution. 

All classes of molecules and compounds that are utilized in both established and emerging chemical,
pharmaceutical, textile, food and feed, and detergent markets must meet stringent economic and
environmental standards. The synthesis of polymers, pharmaceuticals, natural products and agrochemicals 
is often hampered by expensive processes which produce harmful byproducts and which suffer from low
enantioselectivity (Faber, 1995; Tonkovich and Gerber, U.S. Dept. of Energy study, 1995). Enzymes have
a number of remarkable advantages which can overcome these problems in catalysis: they act on single 
functional groups, they distinguish between similar functional groups on a single molecule, and they
distinguish between enantiomers. Moreover, they are biodegradable and function at very low mole
fractions in reaction mixtures. Because of their chemo-, regio- and stereospecificity, enzymes present a
unique opportunity to optimally achieve desired selective transformations. These are often extremely 
difficult to duplicate chemically, especially in single-step reactions. The elimination of the need for
protecting groups, selectivity, the ability to carry out multi-step transformations in a single reaction vessel,
along with the concomitant reduction in environmental burden, have led to increased demand for
enzymes in the chemical and pharmaceutical industries (Faber, 1995). Enzyme-based processes have been
gradually replacing many conventional chemical-based methods (Wrotnowski, 1997). More widespread
industrial use is currently limited primarily by the relatively small number of commercially available
enzymes. Only about 300 enzymes (excluding DNA-modifying enzymes) are at present commercially available
from the more than 3000 non-DNA-modifying enzyme activities thus far described.

[0003] The use of enzymes for technological applications also may require performance under demanding
industrial conditions. This includes activities in environments or on substrates for which the currently 
known arsenal of enzymes was not evolutionarily selected. 

Enzymes have evolved by selective pressure to perform very specific biological functions within the milieu
of a living organism, under conditions of mild temperature, pH and salt concentration. For the most part,
the non-DNA modifying enzyme activities thus far described (Enzyme Nomenclature, 1992) have been 
isolated from mesophilic organisms, which represent a very small fraction of the available phylogenetic 
diversity (Amann et al., 1995). The dynamic field of biocatalysis takes on a new dimension with the help of 
enzymes isolated from microorganisms that thrive in extreme environments. Such enzymes must function 
at temperatures above 100°C in terrestrial hot springs and deep sea thermal vents, at temperatures below 0°C
in arctic waters, in the saturated salt environment of the Dead Sea, at pH values around 0 in coal deposits
and geothermal sulfur-rich springs, or at pH values greater than 11 in sewage sludge (Adams and Kelly,
1995). The enzymes may also be obtained from: geothermal and hydrothermal fields, acidic soils, solfatara
and boiling mud pots, pools, hot-springs and geysers where the enzymes are neutral to alkaline, marine
actinomycetes, metazoan, endo and ectosymbionts, tropical soil, temperate soil, arid soil, compost piles,
manure piles, marine sediments, freshwater sediments, water concentrates, hypersaline and super-cooled
sea ice, arctic tundra, Sargasso Sea, open ocean pelagic, marine snow, microbial mats (such as whale falls,
springs and hydrothermal vents), gut microbial communities (e.g., from insects, nematodes, etc.), plant
endophytes, epiphytic water samples, industrial sites and ex situ enrichments. Additionally, the enzymes 
may be isolated from eukaryotes, prokaryotes, myxobacteria (epothilone), air, water, sediment, soil or rock. 
Enzymes obtained from these extremophilic organisms open a new field in biocatalysis. 

[0004] For example, several esterases and lipases cloned and expressed from extremophilic organisms are 
remarkably robust, showing high activity throughout a wide range of temperatures and pHs. The 
fingerprints of several of these esterases show a diverse substrate spectrum, in addition to differences in the 
optimum reaction temperature. Certain esterases recognize only short chain substrates while others
act only on long chain substrates, in addition to showing large differences in optimal reaction temperature. These
results demonstrate that more diverse enzymes fulfilling the need for new biocatalysts can be found by 
screening biodiversity. Substrates upon which enzymes act are herein defined as bioactive substrates. 

[0005] Furthermore, virtually all of the enzymes known so far have come from cultured organisms, mostly 
bacteria and more recently archaea (Enzyme Nomenclature, 1992). 

Traditional enzyme discovery programs rely solely on cultured microorganisms for their screening 
programs and are thus only accessing a small fraction of natural diversity. Several recent studies have 
estimated that only a small percentage, conservatively less than 1%, of organisms present in the natural 
environment have been cultured (see Table I, Amann et al., 1995; Barns et al., 1994; Torsvik, 1990). For
example, Norman Pace's laboratory recently reported intensive untapped diversity in water and sediment
samples from the "Obsidian Pool" in Yellowstone National Park, a spring which has been studied since the
early 1960s by microbiologists (Barns, 1994). Amplification and cloning of 16S rRNA encoding sequences
revealed mostly unique sequences with little or no representation of the organisms which had previously
been cultured from this pool. This demonstrates substantial diversity of archaea with so far unknown
morphological, physiological and biochemical features which may be useful in industrial processes. David
Ward's laboratory in Bozeman, Montana has performed similar studies on the cyanobacterial mat of Octopus
Spring in Yellowstone Park and came to the same conclusion, namely, tremendous uncultured diversity
exists (Bateson et al., 1989). Giovannoni et al. (1990) reported similar results using bacterioplankton
collected in the Sargasso Sea while Torsvik et al. (1990) have shown by DNA reassociation kinetics that 
there is considerable diversity in soil samples. Hence, this vast majority of microorganisms represents an
untapped resource for the discovery of novel biocatalysts. In order to access this potential catalytic 
diversity, recombinant screening approaches are required. 

[0006] Bacteria and many eukaryotes have a coordinated mechanism for regulating genes whose products 
are involved in related processes. The genes are clustered, in structures referred to as "gene clusters," on a
single chromosome and are transcribed together under the control of a single regulatory sequence,
including a single promoter which initiates transcription of the entire cluster. The gene cluster, the
promoter, and additional sequences that function in regulation altogether are referred to as an "operon" and
can include up to 30 or more genes, usually from 2 to 6 genes. Thus, a gene cluster is a group of adjacent 
genes that are either identical or related, usually as to their function. 

[0007] Some gene families consist of one or more identical members. Clustering is a prerequisite for 
maintaining identity between genes, although clustered genes are not necessarily identical. Gene clusters 
range from extremes where a duplication is generated of adjacent related genes to cases where hundreds of 
identical genes lie in a tandem array. 
Sometimes no significance is discernible in a repetition of a particular gene. A principal example of this is
the expressed duplicate insulin genes in some species, whereas a single insulin gene is adequate in other
mammalian species.

[0008] It is important to further research gene clusters and the extent to which the full length of the cluster 
is necessary for the expression of the proteins resulting therefrom. Gene clusters undergo continual 
reorganization and, thus, the ability to create heterogeneous libraries of gene clusters from, for example, 
bacterial or other prokaryote sources is valuable in determining sources of novel proteins, particularly 
including enzymes such as, for example, the polyketide synthases that are responsible for the synthesis of 
polyketides having a vast array of useful activities. As indicated, other types of proteins and molecules that 
are the product(s) of gene clusters are also contemplated, including, for example, antibiotics, antivirals,
antitumor agents and regulatory proteins, such as insulin. 

[0009] Polyketides are molecules which are an extremely rich source of bioactivities, including antibiotics 
(such as tetracyclines and erythromycin), anti-cancer agents (daunomycin), immunosuppressants (FK506 
and rapamycin), and veterinary products (monensin). Many polyketides (produced by polyketide synthases) 
are valuable as therapeutic agents. Polyketide synthases are multifunctional enzymes that catalyze the 
biosynthesis of a huge variety of carbon chains differing in length and patterns of functionality and
cyclization. Polyketide synthase genes fall into gene clusters and at least one type (designated type I) of
polyketide synthase has large genes and encoded enzymes, complicating genetic manipulation and in
vitro studies of these genes/proteins. 

[00010] Gene libraries of microorganisms have been prepared for the purpose of identifying genes involved 
in biosynthetic pathways that produce medicinally-active metabolites and specialty chemicals. These 
pathways require multiple proteins (specifically, enzymes), entailing greater complexity than the single 
proteins used as drug targets. For example, genes encoding pathways of bacterial polyketide synthases 
(PKSs) were identified by screening gene libraries of the organism (Malpartida et al. 1984, Nature
309:462; Donadio et al. 1991, Science 252:675-679). PKSs catalyze multiple steps of the biosynthesis of
polyketides, an important class of therapeutic compounds, and control the structural diversity of the
polyketides produced. A host-vector system in Streptomyces has been developed that allows directed
mutation and expression of cloned PKS genes (McDaniel et al. 1993, Science 262:1546-1550; Kao et al.
1994, Science 265:509-512). This specific host-vector system has been used to develop more efficient
ways of producing polyketides, and to rationally develop novel polyketides (Khosla et al., WO 95/08548).

[00011] Another example is the production of the textile dye, indigo, by fermentation in an E. coli host.
Two operons containing the genes that encode the multienzyme biosynthetic pathway have been
genetically manipulated to improve production of indigo by the foreign E. coli host (see, e.g., Ensley et al.
1983, Science 222:167-169; Murdock et al. 1993, Bio/Technology 11:381-386). Overall, conventional
studies of heterologous expression of genes encoding a metabolic pathway involve directed cloning, 
sequence analysis, designed mutations, and rearrangement of specific genes that encode proteins known to 
be involved in previously characterized metabolic pathways. 

[00012] In view of numerous advances in the understanding of disease mechanisms and identification of 
drug targets, there is an increasing need for innovative strategies and methods for rapidly identifying lead 
compounds and channeling them toward clinical testing.

[00013] Of particular interest are cellular "switches" known as receptors which interact with a variety of
biomolecules, such as hormones, growth factors, and neurotransmitters, to mediate the transduction of
an "external" cellular signaling event into an "internal" cellular signal. External signaling events include the
binding of a ligand to the receptor, and internal events include the modulation of a pathway in the
cytoplasm or nucleus involved in the growth, metabolism or apoptosis of the cell. Internal events also
include the inhibition or activation of transcription of certain nucleic acid sequences, resulting in the
increase or decrease in the production or presence of certain molecules (such as nucleic acid, proteins, 
and/or other molecules affected by this increase or decrease in transcription). Drugs to cure disease or 
alleviate its symptoms can activate or block any of these events to achieve a desired pharmaceutical effect. 

[00014] Transduction can be accomplished by a transducing protein in the cell membrane which is 
activated upon an allosteric change that the receptor may undergo upon binding to a specific biomolecule.
The "active" transducing protein activates production of so-called "second messenger" molecules within the
cell, which then activate certain regulatory proteins within the cell that regulate gene expression or alter 
some metabolic process. 

Variations on the theme of this "cascade" of events occur. For example, a receptor may act as its own
transducing protein, or a transducing protein may act directly on an intracellular target without mediation 
by a second messenger. 

[00015] Signal transduction is a fundamental area of inquiry in biology. For instance, ligand/receptor 
interactions and the receptor/effector coupling mediated by guanine nucleotide-binding proteins (G
proteins) are of interest in the study of disease. A large number of G protein-linked receptors funnel
extracellular signals as diverse as hormones, growth factors, neurotransmitters, primary sensory stimuli,
and other signals through a set of G proteins to a small number of second-messenger systems. The G
proteins act as molecular switches with an "on" and "off" state governed by a GTPase cycle. Mutations in G
proteins may result in either constitutive activation or loss of expression.

[00016] Many receptors convey messages through heterotrimeric G proteins, of which at least 17 distinct 
forms have been isolated. Additionally, there are several different G protein-dependent effectors. The 
signals transduced through the heterotrimeric G proteins in mammalian cells influence intracellular events 
through the action of effector molecules. 

[00017] Given the variety of functions subserved by G protein-coupled signal transduction, it is not 
surprising that abnormalities in G protein-coupled pathways can lead to diseases with manifestations as 
dissimilar as blindness, hormone resistance, precocious puberty and neoplasia. G-protein-coupled receptors 
are extremely important to drug research efforts. It is estimated that up to 60% of today's prescription drugs 
work by somehow interacting with G protein-coupled receptors. However, these drugs were developed 
using classical medicinal chemistry and without a knowledge of the molecular mechanism of action. A 
more efficient drug discovery program could be deployed by targeting individual receptors and making use 
of information on gene sequence and biological function to develop effective therapeutics. 

[00018] Several groups have reported cells which express mammalian G proteins or subunits thereof, along 
with mammalian receptors which interact with these molecules. For example, WO 92/05244 (April 2, 1992)
describes a transformed yeast cell which is incapable of producing a yeast G protein subunit, but which has 
been engineered to produce both a mammalian G protein subunit and a mammalian receptor which interacts 
with the subunit. 

The authors found that a modified version of a specific mammalian receptor integrated into the membrane 
of the cell, as shown by studies of the ability of isolated membranes to interact properly with various 
known agonists and antagonists of the receptor. Ligand binding resulted in G protein-mediated signal 
transduction.

[00019] Another group has described the functional expression of a mammalian adenylyl cyclase in yeast, 
and the use of the engineered yeast cells in identifying potential inhibitors or activators of the mammalian
adenylyl cyclase (WO 95/30012). Adenylyl cyclase is among the best studied of the effector molecules
which function in mammalian cells in response to activated G proteins. "Activators" of adenylyl cyclase
cause the enzyme to become more active, elevating the cAMP signal of the yeast cell to a detectable 
degree.

"Inhibitors" cause the cyclase to become less active, reducing the cAMP signal to a detectable degree. The
method describes the use of the engineered yeast cells to screen for drugs which activate or inhibit adenylyl 
cyclase by their action on G protein-coupled receptors. 

[00020] When attempting to identify genes encoding bioactivities of interest from complex mixed 
population nucleic acid libraries, the rate limiting steps in discovery occur at both the DNA cloning level
and at the screening level. Screening of complex mixed population libraries which contain, for example,
hundreds of different organisms requires the analysis of several million clones to cover this genomic diversity.
An extremely high-throughput screening method has been developed to handle the enormous numbers of 
clones present in these libraries. 
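The scale of the screening problem described above can be illustrated with the standard Clarke-Carbon library coverage estimate, N = ln(1 - P) / ln(1 - f), where f is the fraction of the pooled DNA carried by a single clone. The following is a minimal sketch only. The function name and every numerical value (300 genomes, 4 Mb per genome, 5 kb inserts, 99% probability) are assumptions chosen for illustration and are not taken from the present disclosure; the sketch also assumes that all genomes are equally abundant in the sample.

```python
import math


def clones_required(n_genomes, genome_bp, insert_bp, probability=0.99):
    """Estimate how many clones must be screened to sample a given sequence.

    Applies the Clarke-Carbon estimate N = ln(1 - P) / ln(1 - f), where
    f = insert_bp / (n_genomes * genome_bp). Equal abundance of every
    genome is an illustrative assumption, not a property of real samples.
    """
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    pooled_bp = n_genomes * genome_bp
    if not 0 < insert_bp < pooled_bp:
        raise ValueError("insert must be smaller than the pooled genome size")
    fraction = insert_bp / pooled_bp
    # log1p(-x) computes ln(1 - x) accurately for very small x.
    return math.ceil(math.log1p(-probability) / math.log1p(-fraction))


if __name__ == "__main__":
    # Hypothetical mixed population: 300 genomes of 4 Mb, 5 kb inserts.
    print(f"{clones_required(300, 4_000_000, 5_000):,} clones")
```

Under these assumed values the estimate is on the order of one million clones, and it grows further if inserts are smaller, if more organisms are present, or if the organisms are unevenly represented.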

[00021] In traditional flow cytometry, it is common to analyze very large numbers of eukaryotic cells in a 
short period of time. Newly developed flow cytometers can analyze and sort up to 20,000 cells per second. 
In a typical flow cytometer, individual particles pass through an illumination zone and appropriate 
detectors, gated electronically, measure the magnitude of a pulse representing the extent of light scattered. 
The magnitudes of these pulses are sorted electronically into "bins" or "channels", permitting the display of
histograms of the number of cells possessing a certain quantitative property versus the channel number
(Davey and Kell, 1996). It was recognized early on that the data accruing from flow cytometric
measurements could be analyzed (electronically) rapidly enough that electronic cell-sorting procedures
could be used to sort cells with desired properties into separate "buckets", a procedure usually known as
fluorescence-activated cell sorting (Davey and Kell, 1996). 
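The binning and sorting logic described in this paragraph can be sketched in a few lines. This is an illustrative model only, not the electronics of any particular instrument: the function names, the 256-channel default, and the gate representation are hypothetical, and the only figure taken from the text is the rate of up to 20,000 cells per second.

```python
from collections import Counter


def to_channel(pulse, max_pulse, n_channels=256):
    """Map a pulse magnitude onto an integer channel (bin) index."""
    if pulse < 0 or max_pulse <= 0:
        raise ValueError("pulse must be >= 0 and max_pulse must be > 0")
    index = int(pulse * n_channels / max_pulse)
    # Saturated pulses are clamped into the top channel.
    return min(index, n_channels - 1)


def channel_histogram(pulses, max_pulse, n_channels=256):
    """Count events per channel, i.e. cell number versus channel number."""
    counts = Counter(to_channel(p, max_pulse, n_channels) for p in pulses)
    return [counts.get(channel, 0) for channel in range(n_channels)]


def sort_events(pulses, max_pulse, gate, n_channels=256):
    """Split events into (selected, rejected) using an inclusive channel gate."""
    low, high = gate
    selected, rejected = [], []
    for pulse in pulses:
        channel = to_channel(pulse, max_pulse, n_channels)
        (selected if low <= channel <= high else rejected).append(pulse)
    return selected, rejected


def screening_seconds(n_events, events_per_second=20_000):
    """Time needed to pass n_events through the detector at a fixed rate."""
    return n_events / events_per_second
```

For example, `sort_events(pulses, 10.0, gate=(9, 9), n_channels=10)` would place only the brightest events in the selected "bucket", and `screening_seconds(2_000_000)` gives 100 seconds for two million events at the stated maximum rate.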

[00022] Fluorescence-activated cell sorting has been primarily used in studies of human and animal cell 
lines and the control of cell culture processes. Fluorophore labeling of cells and measurement of the 
fluorescence can give quantitative data about specific target molecules or subcellular components and their 
distribution in the cell population. Flow cytometry can quantitate virtually any cell-associated property or 
cell organelle for which there is a fluorescent probe (or natural fluorescence). The parameters which can be 
measured have previously been of particular interest in animal cell culture. 

[00023] Flow cytometry has also been used in cloning and selection of variants from existing cell clones. 
This selection, however, has required stains that diffuse through cells passively, rapidly and irreversibly, 
with no toxic effects or other influences on metabolic or physiological processes. Since, typically, flow 
sorting has been used to study animal cell culture performance, physiological state of cells, and the cell 
cycle, one goal of cell sorting has been to keep the cells viable during and after sorting. 

[00024] There currently are no reports in the literature of screening and discovery of recombinant enzymes 
in E. coli expression libraries by fluorescence-activated cell sorting of single cells. Furthermore, there are no
reports of recovering DNA encoding bioactivities screened by expression screening in E. coli using a 
FACS machine. 

[00025] A limited number of papers describing various applications of flow cytometry in the field of
microbiology and fluorescence-activated sorting of microorganisms have, however, been published (Davey
and Kell, 1996). Fluorescence and other forms of staining have been employed for microbial discrimination
and identification, and in the analysis of the interaction of drugs and antibiotics with microbial cells. Flow
cytometry has been used in aquatic biology, where autofluorescence of photosynthetic pigments is used in
the identification of algae or DNA stains are used to quantify and count marine populations (Davey and
Kell, 1996). Thus, Diaper and Edwards used flow cytometry to detect viable bacteria after staining with a
range of fluorogenic esters including fluorescein diacetate (FDA) derivatives and ChemChrome B, a
proprietary stain sold commercially for the detection of viable bacteria in suspension (Diaper and Edwards, 
1994). Labeled antibodies and oligonucleotide probes have also been used for these purposes. 

[00026] Papers have also been published describing the application of flow cytometry to the detection of 
native and recombinant enzymatic activities in eukaryotes. Betz et al. studied native (non-recombinant) 
lipase production by the eukaryote, Rhizopus arrhizus with flow cytometry. They found that spore 
suspensions of the mold were heterogeneous as judged by light-scattering data obtained with excitation at 
633 nm, and they sorted clones of the subpopulations into the wells of microtiter plates. After germination 
and growth, lipase production was automatically assayed (turbidimetrically) in the microtiter plates, and a 
representative set of the most active were re-isolated, cultured, and assayed conventionally (Betz et al., 
1984). 

[00027] Srienc et al. have reported a flow cytometric method for detecting cloned β-galactosidase activity
in the eukaryotic organism, S. cerevisiae. The ability of flow cytometry to make measurements on single
cells means that individual cells with high levels of expression (e.g., due to gene amplification or higher
plasmid copy number) could be detected. In the method reported, a non-fluorescent compound
(β-naphthol-β-galactopyranoside) is cleaved by β-galactosidase and the liberated naphthol is trapped to form an
insoluble fluorescent product. The insolubility of the fluorescent product is of great importance here to
prevent its diffusion from the cell. Such diffusion would not only lead to an underestimation of
β-galactosidase activity in highly active cells but could also lead to an overestimation of enzyme activity in
inactive cells or those with low activity, as they may take up the leaked fluorescent compound, thus 
reducing the apparent heterogeneity of the population. 

[00028] One group has described the use of a FACS machine in an assay detecting fusion proteins
expressed from a specialized transducing bacteriophage in the prokaryote Bacillus subtilis (see, e.g.,
Chung et al., J. of Bacteriology, Apr. 1994, p. 1977-1984; Chung et al., Biotechnology and
Bioengineering, Vol. 47, pp. 234-242 (1995)). This group monitored the expression of a lacZ gene
(encoding beta-galactosidase) fused to the sporulation loci in B. subtilis (spo). The technique used to monitor
beta-galactosidase expression from spo-lacZ fusions in single cells involved taking samples from a
sporulating culture, staining them with a commercially available fluorogenic substrate for
beta-galactosidase called C8-FDG and quantitatively analyzing fluorescence in single cells by flow cytometry.
In this study, the flow cytometer was used as a detector to screen for the presence of the spo gene during 
the development of the cells. The device was not used to screen and recover positive cells from a gene 
expression library or nucleic acid for the purpose of discovery. 

[00029] Another group has utilized flow cytometry to distinguish between the developmental stages of the 
delta-proteobacterium Myxococcus xanthus (F. Russo-Marie et al., PNAS, Vol. 90, pp. 8194-8198,
September 1993). As in the previously described study, this study employed the capabilities of the FACS
machine to detect and distinguish genotypically identical cells in different developmental regulatory states.
The screening of an enzymatic activity was used in this study as an indirect measure of developmental 
changes. 

[00030] The lacZ gene from E. coli is often used as a reporter gene in studies of gene expression regulation, 
such as those to determine promoter efficiency, the effects of trans-acting factors, and the effects of other
regulatory elements in bacterial, yeast, and animal cells. Using a chromogenic substrate, such as ONPG
(o-nitrophenyl-β-D-galactopyranoside), one can measure expression of β-galactosidase in cell cultures; but it is
not possible to monitor expression in individual cells and to analyze the heterogeneity of expression in cell
populations. The use of fluorogenic substrates, however, makes it possible to determine β-galactosidase
activity in a large number of individual cells by means of flow cytometry. This type of determination can
be more informative with regard to the physiology of the cells, since gene expression can be correlated with 
the stage in the mitotic cycle or the viability under certain conditions. In 1994, Plovins et al. reported the
use of fluorescein-di-β-D-galactopyranoside (FDG) and C12-FDG as substrates for β-galactosidase
detection in animal, bacterial, and yeast cells. This study compared the two molecules as substrates for
β-galactosidase, and concluded that FDG is a better substrate for β-galactosidase detection by flow cytometry
in bacterial cells. The screening performed in this study was for the comparison of the two substrates. The 
detection capabilities of a FACS machine were employed to perform the study on viable bacterial cells. 

[00031] Cells with chromogenic or fluorogenic substrates yield colored and fluorescent products, 
respectively. Previously, it had been thought that flow cytometry/fluorescence-activated cell sorter
approaches could be of benefit only for the analysis of cells that contain intracellularly, or are normally
physically associated with, the enzymatic activity or small molecule of interest. On this basis, one could
only use fluorogenic reagents which could penetrate the cell and which are thus potentially cytotoxic. To 
avoid clumping of heterogeneous cells, it is desirable in flow cytometry to analyze only individual cells, 
and this could limit the sensitivity and therefore the concentration of target molecules that can be sensed. 
Weaver and his colleagues at MIT and others have developed the use of gel microdroplets containing 
(physically) single cells which can take up nutrients, secrete products, and grow to form colonies. The
diffusional properties of gel microdroplets may be made such that sufficient extracellular product remains 
associated with each individual gel microdroplet, so as to permit flow cytometric analysis and cell sorting 
on the basis of concentration of secreted molecule within each microdroplet. Beads have also been used to 
isolate mutants growing at different rates, and to analyze antibody secretion by hybridoma cells and the 
nutrient sensitivity of hybridoma cells. The gel microdroplet method has also been applied to the rapid 
analysis of mycobacterial growth and its inhibition by antibiotics. 

[00032] There are several hurdles which must be overcome when attempting to detect and sort cells expressing
recombinant enzymes, and recover encoding nucleic acids. FACS systems have typically been based on
eukaryotic separations and have not been refined to accurately sort single E. coli cells; the low forward and
sideward scatter of small particles like E. coli reduces the accuracy of sorting; enzyme substrates
typically used in automated screening approaches, such as umbelliferyl-based substrates, diffuse out of E.
coli at rates which interfere with quantitation. Further, recovery of very small amounts of DNA from sorted
organisms can be problematic.

[00033] There has been a dramatic increase in the need for bioactive compounds with novel activities. This 
demand has arisen largely from changes in worldwide demographics coupled with the clear and increasing
trend in the number of pathogenic organisms that are resistant to currently available antibiotics as well as
the need for new industrial processes for synthesis of compounds. For example, while there has been a
surge in demand for antibacterial drugs in emerging nations with young populations, countries with aging
populations, such as the U.S., require a growing repertoire of drugs against cancer, diabetes, arthritis and
other debilitating conditions. The death rate from infectious diseases increased 58% between 1980 and
1992 and it has been estimated that the emergence of antibiotic resistant microbes has added in excess of
$30 billion annually to the cost of health care in the U.S. alone (see, e.g., Adams et al., Chemical and
Engineering News, 1995; Amann et al., Microbiological Reviews, 59, 1995). As a response to this trend,
pharmaceutical companies have significantly increased their screening of microbial diversity for 
compounds with unique activities or specificities. 

[00034] The majority of bioactive compounds currently in use are derived from soil microorganisms. Many
microbes inhabiting soils and other complex ecological communities produce a variety of compounds that
increase their ability to survive and proliferate. These compounds are generally thought to be nonessential
for growth of the organism and are synthesized with the aid of genes involved in intermediary metabolism.
Such secondary metabolites that influence the growth or survival of other organisms are known
as "bioactive" compounds and serve as key components of the chemical defense arsenal of both micro- and
macroorganisms. Humans have exploited these compounds for use as antibiotics, anti-infectives and other
bioactive compounds with activity against a broad range of prokaryotic and eukaryotic pathogens (Barns
et al., Proc. Nat. Acad. Sci. U.S.A., 91, 1994).

[00035] The approach currently used to screen microbes for new bioactive compounds has been largely 
unchanged since the inception of the field. New isolates of bacteria, particularly gram positive strains from
soil environments, are collected and their metabolites tested for pharmacological activity. 

[00036] There is still tremendous biodiversity that remains untapped as the source of lead compounds.
However, the currently available methods for screening and producing lead compounds cannot be applied 
efficiently to these under-explored resources. For instance, it is estimated that at least 99% of marine 
bacteria species do not survive on laboratory media, and commercially available fermentation equipment is 
not optimal for use in the conditions under which these species will grow, hence these organisms are 
difficult or impossible to culture for screening or re-supply. Recollection, growth, strain improvement, 
media improvement and scale-up production of the drug-producing organisms often pose problems for 
synthesis and development of lead compounds. Furthermore, the need for the interaction of specific 
organisms to synthesize some compounds makes their use in discovery extremely difficult. 

New methods to harness the genetic resources and chemical diversity of these untapped sources of
compounds for use in drug discovery are very valuable. 

[00037] A central core of modern biology is that genetic information resides in a nucleic acid genome, and
that the information embodied in such a genome (i.e., the genotype) directs cell function. This occurs
through the expression of various genes in the genome of an organism and regulation of the expression of
such genes. The expression of genes in a cell or organism defines the cell or organism's physical
characteristics (i.e., its phenotype). This is accomplished through the translation of genes into proteins.
Determining the biological activity of a protein obtained from an environmental sample can provide
valuable information about the role of proteins in the environment. In addition, such information can help
in the development of biologics, diagnostics, therapeutics, and compositions for industrial and agricultural
applications. 

[00038] In the United States, cancer is the second leading cause of disease-related deaths, second only to 
cardiovascular disease and it is projected to become the leading cause of death within a few years. The 
most common curative therapies for cancers found at an early stage include surgery and radiation (1).
These methods are not nearly as successful in the more advanced stages of cancer. Current 
chemotherapeutic agents have been useful but are limited in their effectiveness. Significant results are 
obtained with chemotherapy in a small range of cancers including childhood cancers and certain adult 
malignancies such as lymphoma and leukemia (2). Despite these positive results, most chemotherapeutic 
treatments are not curative and serve primarily as palliatives (1). Thus, it is clear that current medical 
science still has a long way to go before providing long-term survival to patients and curability of most 
cancers. However, basic research over the past 20 years has provided a vast amount of scientific 
information defining key players in the progression of cancers. Understanding the disease processes at the 
molecular level provides the means to determine optimal molecular targets and presumably selectively kill 
cancerous tissues. Some of the key areas that have been identified in the progression of tumors include 
proliferative signal transduction, aberrant cell-cycle regulation, apoptosis, telomere biology, genetic
instability and angiogenesis (3). 

This basic research is now beginning to pay off as progress towards more effective treatments is beginning 
to emerge (4,5). New chemotherapeutic agents directed against these identified areas are in Phase I-III 
clinical trials with some of the most promising agents active against tyrosine kinases involved in signal 
transduction. Small molecule inhibitors of Bcr-abl, protein kinase C, VEGF receptors, and EGF receptors,
to name a few, are all in clinical trials (4). Some specific examples include the EGF receptor inhibitors,
ZD1839 and CP358774, which are in Phase II trials and appear to be well tolerated by patients with
positive signs of clinical activity (6). Even with this progress, the complexities of tumorigenesis necessitate 
not only the ongoing discovery and development of novel therapeutic agents but also the basic research to 
elucidate the underlying mechanisms of the disease. Presently, there are at least 50 known cancer related 
targets and it has been speculated that there may be up to several hundred new targets discovered (2). To 
make use of this influx of information, novel methods for the ultra high throughput screening of potential 
anti-cancer drugs must be developed. 

[00039] Recent technological developments in molecular biology, automation, miniaturization, and 
information technology have facilitated the high throughput screening of novel compounds from a variety 
of sources. However, despite the increased throughput, there is some disappointment in the industry 
regarding the number of novel drugs that have resulted from these efforts (7). One of the significant 
challenges is to find sufficient numbers of compounds with the structural diversity necessary to increase the 
chances of finding activity at the molecular target. Currently, screened compounds come from chemical 
and combinatorial libraries, historical compound collections and natural product libraries (8). Of these, one 
of the richest sources of drugs has been from natural product libraries. Cragg et al (9) reported that over 
60% of the approved anticancer drugs and pre-NDA candidates between 1984 and 1995 were from natural 
sources or derived from natural products. In fact, it is estimated that 39% of all 520 new approved drugs 
during this time period were from or derived from natural products with 80% of anti-infectives coming 
from nature. Typically, natural products are small molecules that have a much greater structural diversity
than most combinatorial approaches. Small molecules in general are favored by the pharmaceutical
industry because they are more "drug-like" in nature with the ability to penetrate tumors, be absorbed, and
metabolized easily. However, natural products have their disadvantages, largely due to the reproducibility 
of the source, the labor-intensive extraction process, the abundance of the supply, and the concerns over 
rights to biodiversity (8). 

[00040] The therapeutic agents from natural sources have been primarily of plant and microbial origins. Of 
these, the greatest biodiversity exists in the microorganisms that populate virtually every corner of the
earth. The approach currently used to screen microbes for new bioactive compounds has changed little over 
the last 50 years. Microbiologists collect samples from the environment, isolate a pure culture, grow up 
sufficient material, extract the culture, and test their metabolites for pharmacological activity. Variations of 
these natural products can then be generated through mutagenesis of the producing organism or through 
chemical or biochemical modification of the original backbone molecules. 

Natural products are typically made by multi-enzyme systems in which each enzyme carries out one of the 
many transformations required to make the final small molecule products, an example being antibiotics. 
These bioactive molecules are derived from the organism's ability to produce secondary metabolites in 
response to the specific needs and challenges of their local environments. The genes encoding these 
enzymes are often clustered into so-called "biosynthetic operons" which contain the blueprint for building a
natural product (10). This blueprint for production of a small bioactive molecule is typically more than
25,000 nucleotides and can be greater than 100,000 nucleotides. There are many examples of entire
pathways encoding for the production of such small molecules as oxytetracycline, jadomycin,
daunorubicin, to name just a few, that have been cloned as contiguous pieces of DNA from a producing
organism (11). Some of these pathways (e.g., actinorhodin, tetracenomycin, puromycin, nikkomycin) have
been transferred to other microbial hosts and the small molecule heterologously expressed (11). 

[00041] A more recent approach has been to use recombinant techniques to synthesize hybrid antibiotic 
pathways by combining gene subunits from previously characterized pathways. This approach, 
called "combinatorial biosynthesis," has been focused primarily on the polyketide antibiotics and has resulted
in a number of compounds which have displayed activity (12,13). In one such approach using the
erythronolide biosynthetic operon, enzymatic domains have been added to (14) and repositioned within the 
operon (15), thereby reprogramming polyketide biosynthesis. However, compounds with novel antibiotic 
activities have not yet been reported: an observation that may be due to the fact that the pathway subunits 
are derived from those encoding previously characterized compounds. 

What has not been accounted for in previous attempts to discover novel bioactive compounds is the
relatively recent observation that only a small fraction of microbes in natural environments can be grown 
under laboratory conditions. Estimates are that far less than 1% of all prokaryotes are capable of being 
grown in pure culture in the laboratory. This implies a need for culture-independent methods for bioactive 
compound discovery. 

[00042] Culture-independent approaches to directly clone genes encoding both target enzymes and other 
bioactive molecules from environmental samples are based on the construction of libraries which represent 
the collective genomes of naturally occurring organisms, archived in cloning vectors that can be
propagated in E. coli, Streptomyces, or other suitable hosts. Because the cloned DNA is initially extracted 
directly from environmental samples containing a mixed population of organisms, the representation of the 
libraries is not limited to the small fraction of prokaryotes that can be grown in pure culture, nor is it biased 
towards a few rapidly growing species. Samples can be obtained from virtually all ecosystems represented 
on earth, including such extreme environments as geothermal and hydrothermal vents, acidic soils and 
boiling mud pots, contaminated industrial sites, marine symbionts, etc. 

[00043] Screening of complex mixed population libraries containing, for example, 100 different organisms 
requires the analysis of tens of millions of clones to cover the genomic diversity. An extremely high 
throughput screening method must be implemented to handle the enormous numbers of clones present in 
these libraries. In the pharmaceutical industry today, high throughput screening typically has throughput 
rates on the order of 10,000 compounds per assay per day with some laboratories working at 100,000 
assays per day. 
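
The scale of such a library can be illustrated with the standard Clarke-Carbon coverage estimate, N = ln(1 - P) / ln(1 - f), where f is the fraction of the sampled DNA carried by one clone and P is the desired probability that a given locus is represented. The following is a minimal sketch only: the function name `clones_needed`, the insert size, the genome size and the abundance figures are illustrative assumptions and are not taken from this disclosure.

```python
import math


def clones_needed(insert_bp, genome_bp, abundance=1.0, p_coverage=0.99):
    """Clarke-Carbon estimate of clones needed to capture a given locus.

    insert_bp   -- average insert size per clone (assumed value)
    genome_bp   -- genome size of the organism of interest (assumed value)
    abundance   -- fraction of the sample DNA contributed by that organism
    p_coverage  -- desired probability that the locus is in the library
    """
    # Probability that a single random clone carries the locus of interest.
    f = (insert_bp / genome_bp) * abundance
    if not 0.0 < f < 1.0:
        raise ValueError("insert/genome/abundance give an invalid fraction")
    return math.ceil(math.log(1.0 - p_coverage) / math.log(1.0 - f))


if __name__ == "__main__":
    # Hypothetical numbers: 3 kb inserts, 4 Mb genomes.
    even = clones_needed(3_000, 4_000_000, abundance=1 / 100)
    rare = clones_needed(3_000, 4_000_000, abundance=1 / 10_000)
    print(f"evenly represented member of 100 organisms: {even:,} clones")
    print(f"rare member (0.01% of sample DNA): {rare:,} clones")
```

Under these assumptions the required library size grows roughly in inverse proportion to the abundance of the organism, so rare members of a mixed population dominate the screening burden.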

Most of the development in the industry has centered around the miniaturization and automation of these 
screens to higher density, smaller volume plate formats. However, this strategy could be reaching the 
practical limits of conventional liquid-dispensing technology and current microplate fabrication processes, 
as well as the limits in controlling evaporation in open systems with very small well volumes. 

[00044] Current platforms for screening micro-scale particles of interest include plates that are formed with
small wells, or through-holes. The wells or through-holes are used to hold a sample to be analyzed. The
sample typically contains the particles of interest. When wells are used, complex and inefficient sample
delivery and extraction systems must be used in order to deposit the sample into the wells on the plate, and
remove the sample from the wells for further analysis. Well-based platforms have a closed bottom, so
gravity is primarily relied on to hold the sample on the plate while the particulate develops or the cells
of interest incubate.

[00045] Another type of platform uses through-holes, which are typically machined into a plate by one of a 
number of well-known methods. Through-holes rely on capillary forces for introducing the sample to the 
plate, and utilize surface tension for suspending the sample in the through-holes. However, typical
through-hole-based devices are limited to relatively small aspect ratios (the ratio of length to internal diameter of
the hole). A small aspect ratio yields greater evaporative loss of a liquid contained in the hole, and such
evaporation is difficult to control. Through-holes are also limited in their functionality. For example, the
process of forming through-holes in a plate usually does not allow for the use of various materials to line
the inside of the holes, or to clad the outside of the holes.

[00046] Fluorescence and other forms of staining have been employed for microbial discrimination and
identification, and in the analysis of the interaction of drugs and antibiotics with microbial cells. Flow
cytometry has been used in aquatic biology, where autofluorescence of photosynthetic pigments is used in
the identification of algae or DNA stains are used to quantify and count marine populations (Davey and
Kell, 1996). Diaper and Edwards used flow cytometry to detect viable bacteria after staining with a range
of fluorogenic esters including fluorescein diacetate (FDA) derivatives and CemChrome B, a stain sold
commercially for the detection of viable bacteria in suspension (Diaper and Edwards, 1994). Labeled
antibodies and oligonucleotide probes can also be used for these purposes.

[00047] Papers have been published describing the application of flow cytometry to the detection of native 
and recombinant enzymatic activities in eukaryotes. Betz et al. studied native (non-recombinant) lipase 
production by the eukaryote, Rhizopus arrhizus with flow cytometry. They found that spore suspensions of 
the mold were heterogeneous as judged by light-scattering data obtained with excitation at 633 nm, and 
they sorted clones of the subpopulations into the wells of microtiter plates. After germination and growth, 
lipase production was automatically assayed (turbidimetrically) in the microtiter plates, and a representative 
set of the most active were re-isolated, cultured, and assayed conventionally (Betz et al. , 1984). The ability 
of flow cytometry to make measurements on single cells means that individual cells with high levels of 
expression (e. g. , due to gene amplification or higher plasmid copy number) could be detected. 

[00048] Cells with chromogenic or fluorogenic substrates yield colored and fluorescent products, 
respectively. Previously, it had been thought that the flow cytometry-fluorescence activated cell sorter 
approaches could be of benefit only for the analysis of cells that contain intracellularly, or are normally 
physically associated with, the enzymatic activity of a molecule of interest. On this basis, one could only 
use fluorogenic reagents which could penetrate the cell and which are thus potentially cytotoxic. In 
addition, gel microdroplets (GMDs) can be used during FACS sorting and culturing. 

All classes of molecules and compounds that are utilized in both established and emerging chemical, 
pharmaceutical, textile, food and feed, detergent markets must meet economical and environmental 
standards. The synthesis of polymers, pharmaceuticals, natural products and agrochemicals is often 
hampered by expensive processes which produce harmful byproducts and which suffer from poor or 
inefficient catalysis. Enzymes, for example, have a number of remarkable advantages which can overcome 
these problems in catalysis: they act on single functional groups, they distinguish between similar 
functional groups on a single molecule, and they distinguish between enantiomers. Moreover, they are 
biodegradable and function at very low mole fractions in reaction mixtures. Because of their chemo-, regio- 
and stereospecificity, enzymes present a unique opportunity to optimally achieve desired selective 
transformations. These are often extremely difficult to duplicate chemically, especially in single-step 
reactions. The elimination of the need for protection groups, selectivity, the ability to carry out multi-step 
transformations in a single reaction vessel, along with the concomitant reduction in environmental burden, 
has led to the increased demand for enzymes in chemical and pharmaceutical industries. Enzyme-based 
processes have been gradually replacing many conventional chemical-based methods. A current limitation 
to more widespread industrial use is primarily due to the relatively small number of commercially available 
enzymes. Only ~300 enzymes (excluding DNA modifying enzymes) are at present commercially available
from the >3000 non-DNA-modifying enzyme activities thus far described.

[00049] The use of enzymes for technological applications also may require performance under demanding 
industrial conditions. This includes activities in environments or on substrates for which the currently 
known arsenal of enzymes was not evolutionarily selected. 

However, the natural environment provides extreme conditions including, for example, extremes in 
temperature and pH. A number of organisms have adapted to these conditions due in part to selection for 
polypeptides that can withstand these extremes.

[00050] Enzymes have evolved by selective pressure to perform very specific biological functions within
the milieu of a living organism, under conditions of temperature, pH and salt concentration. For the most 
part, the non-DNA modifying enzyme activities thus far described have been isolated from mesophilic 
organisms, which represent a very small fraction of the available phylogenetic diversity. The dynamic field 
of biocatalysis takes on a new dimension with the help of enzymes isolated from microorganisms that 
thrive in extreme environments. For example, such enzymes must function at temperatures above 100°C in 
terrestrial hot springs and deep sea thermal vents, at temperatures below 0°C in arctic waters, in the 
saturated salt environment of the Dead Sea, at pH values around 0 in coal deposits and geothermal
sulfur-rich springs, or at pH values greater than 11 in sewage sludge.

Environmental samples obtained, for example, from extreme conditions containing organisms, 
polynucleotides or polypeptides (e. g. , enzymes) open a new field in biocatalysis. 

[00051] In addition to the need for new enzymes for industrial use, there has been a dramatic increase in
the need for bioactive compounds with novel activities. This demand has arisen largely from changes in
worldwide demographics coupled with the clear and increasing trend in the number of pathogenic 
organisms that are resistant to currently available antibiotics. For example, while there has been a surge in 
demand for antibacterial drugs in emerging nations with young populations, countries with aging 
populations, such as the U.S., require a growing repertoire of drugs against cancer, diabetes, arthritis and
other debilitating conditions. The death rate from infectious diseases has increased 58% between 1980 and
1992 and it has been estimated that the emergence of antibiotic resistant microbes has added in excess of
$30 billion annually to the cost of health care in the U.S. alone (Adams et al., Chemical and Engineering
News, 1995; Amann et al., Microbiological Reviews, 59, 1995). As a response to this trend, pharmaceutical
companies have significantly increased their screening of microbial diversity for compounds with unique
activities or specificity.

[00052] The majority of bioactive compounds currently in use are derived from soil microorganisms. Many 
microbes inhabiting soils and other complex ecological communities produce a variety of compounds that 
increase their ability to survive and proliferate. These compounds are generally thought to be nonessential 
for growth of the organism and are synthesized with the aid of genes involved in intermediary metabolism,
hence their name, "secondary metabolites". Secondary metabolites are generally the products of complex
biosynthetic pathways and are usually derived from common cellular precursors. Secondary metabolites
that influence the growth or survival of other organisms are known as "bioactive" compounds and serve as
key components of the chemical defense arsenal of both micro- and macro-organisms. Humans have
exploited these compounds for use as antibiotics, anti-infectives and other bioactive compounds with
activity against a broad range of prokaryotic and eukaryotic pathogens. Approximately 6,000 bioactive
compounds of microbial origin have been characterized, with more than 60% produced by the gram
positive soil bacteria of the genus Streptomyces (Barnes et al., Proc. Nat. Acad. Sci. U.S.A., 91, 1994).
Of these, at least 70 are currently used for biomedical and agricultural applications. The largest class of
bioactive compounds, the polyketides, includes a broad range of antibiotics, immunosuppressants and
anticancer agents which together account for sales of over $5 billion per year.

[00053] Despite the seemingly large number of available bioactive compounds, it is clear that one of the 
greatest challenges facing modern biomedical science is the proliferation of antibiotic resistant pathogens.
Because of their short generation time and ability to readily exchange genetic information, pathogenic 
microbes have rapidly evolved and disseminated resistance mechanisms against virtually all classes of 
antibiotic compounds. For example, there are virulent strains of the human pathogens Staphylococcus and 
Streptococcus that can now be treated with but a single antibiotic, vancomycin, and resistance to this
compound will require only the transfer of a single gene, vanA, from resistant Enterococcus species
(Bateson et al., System. Appl. Microbiol., 12, 1989). When this crucial need for novel
antibacterial compounds is superimposed on the growing demand for enzyme inhibitors, 
immunosuppressants and anti-cancer agents it becomes readily apparent why pharmaceutical companies 
have stepped up their screening of microbial samples for bioactive compounds. 

[00054] Conventional screening methods include liquid phase, microtiter plate based assays. The format
for liquid phase assays is often robotically manipulated 96-, 384-, or 1536-well microtiter plates. Although
these microtiter plate based screening technologies are being used successfully, limitations do exist. The
primary limitation is throughput, as these techniques generally allow the screening of only about 10^5 to 10^6
clones/day/instrument. For example, a typical screen of 100,000 wells on a microtiter based HTS system
requires 261 384-well microtiter plates and over 24 hours of equipment time. However, while 1536-well or
greater plate formats are growing in popularity, the majority of companies involved in HTS continue to use
384-well plates, as this technology is reliable and standardized. While these throughputs may be more than
sufficient for screening isolate and low-complexity libraries, it could take more than a year to thoroughly 
screen one complex gene library. 

Clearly, higher throughput screening technology is necessary. 
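
The throughput arithmetic above can be checked with a short calculation. This is a sketch under stated assumptions: the helper names `plates_needed` and `days_to_screen` are hypothetical, and the library size of 5 x 10^7 clones is an assumed example value rather than a figure from this disclosure. The plate count for the 384-well format reproduces the 261 plates stated in the text.

```python
import math


def plates_needed(wells, wells_per_plate):
    """Number of microtiter plates needed to hold `wells` samples."""
    return math.ceil(wells / wells_per_plate)


def days_to_screen(library_size, clones_per_day):
    """Instrument-days needed to screen a library once at a given rate."""
    return library_size / clones_per_day


if __name__ == "__main__":
    # Plate counts for the 100,000-well screen described in the text.
    for fmt in (96, 384, 1536):
        print(f"{fmt}-well plates for 100,000 wells: {plates_needed(100_000, fmt)}")
    # Hypothetical complex library of 5 x 10^7 clones (assumed size),
    # screened at the 10^5 and 10^6 clones/day/instrument rates from the text.
    for rate in (1e5, 1e6):
        print(f"{rate:.0e} clones/day: {days_to_screen(5e7, rate):.0f} days")
```

At the lower rate, a library of the assumed size would occupy a single instrument for well over a year, which is consistent with the limitation described above.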

[00055] Other screening methods include growth selection (Snustad et al., 1988 ; Lundberg et al. , 1993; 
Yano et al. , 1998), colorimetric screening of bacterial colonies or phage plaques (Kuritz, 1999), in vitro 
expression cloning (King et al. , 1997) and cell surface or phage display (Benhar, 2001). Each of these 
systems has limitations. Solid phase colorimetric plate screening of colonies or plaques is limited by 
relatively low throughput. Even with the use of micro-colonies/plaques and automated imaging and clone 
recovery, thorough screening of complex libraries is impractical. Cell surface and/or phage display 
technologies suffer from structural limitations of the displayed molecule. Often the size and/or shape of the 
displayed molecule is restricted by the display technology. One of the highest throughput screening 
methods, growth selection, is also limited in its scope of usefulness. Assay conditions, temperature and pH, 
are limited by the growth parameters of the host strain. Molecular interactions are often constrained by the 
host cell membranes and/or cell wall, as substrate must be presented to intracellular enzymes. In addition,
"false positives" or a high level of "background" are a common occurrence in many selection assays. With
respect to screening for improved variants in GSSM™ or GeneReassembly libraries, growth selection is
seldom quantitative. 

[00056] Classification of microorganisms based on rRNA analysis has shown that the majority of microbes 
present in nature have no counterpart among previously cultured organisms. Establishing the metabolic 
properties and potential of this microbial diversity in the absence of pure culture presents an immense 
challenge for microbial ecologists. 

Although 16S rRNA studies combined with genomic analyses of naturally occurring marine 
bacterioplankton have suggested the existence of novel metabolic functions, a comprehensive understanding
of the physiology of these organisms, and of the complex environmental processes in which they engage, 
will undoubtedly require their cultivation. 

[00057] Conventional cultivation of microorganisms is laborious, time consuming and, most importantly, 
selective and biased for the growth of specific microorganisms. The majority of cells obtained from nature
and visualized by microscopy are viable, but they do not generally form visible colonies on plates. This
may reflect the artificial conditions inherent in most culture media, for example extremely high substrate
concentrations, or the lack of specific nutrients required for growth. Consistent with this, it was shown 
recently that certain previously uncultivable microorganisms could be grown in pure culture if provided 
with the chemical components of their natural environment. 

SUMMARY OF THE INVENTION [00058] The invention provides systems (products of manufacture)
and methods for use in the biotechnology and chemical industry for identifying molecules that can be used
in biological or chemical processes (e.g., enzymes). The invention provides systems and methods for
identifying novel enzymes in a mixed population environmental sample. By rapidly identifying
polypeptides having an activity of interest and polynucleotides encoding the polypeptide of interest, the
invention provides methods, compositions and sources for the development of biologics, diagnostics,
therapeutics, and compositions for industrial applications. 

[00059] The invention provides methods (and systems for practicing these methods) for maintaining a cell 
from a population of cells comprising: encapsulating or enclosing within a microenvironment at least a 
single cell from a population of cells, wherein the microenvironment is porous, e.g., allows aqueous
nutrients (e.g., media), small molecules, growth factors, hormones, probes, proteins (e.g., antibodies,
cytokines, etc.), plasmids and/or fosmids to diffuse, or flow, through its interior, or, wherein the
microenvironment allows exchange of aqueous nutrients from the exterior to the interior of the
microenvironment; placing the microenvironment comprising the cell into a containment device; and
incubating the microenvironment in the containment device under conditions allowing the cell to survive
and be maintained, wherein conditions allowing the maintained cell to survive comprise exchange of
aqueous nutrients from the exterior to the interior of the microenvironment, or, diffusing or flowing an
aqueous nutrient mixture through the containment device, thereby maintaining the cell. The population of 
cells can comprise a mixed population of cells or an uncultivated population of cells. The mixed population 
of cells can be uncultivated or undefined. The population of cells can be derived from an environmental 
sample, such as from geothermal fields, hydrothermal fields, acidic soils, solfatara mud pots, boiling mud
pots, pools, hot-springs, geysers, marine actinomycetes, metazoan, endosymbionts, ectosymbionts, tropical
soil, temperate soil, arid soil, compost piles, manure piles, marine sediments, freshwater sediments, water
concentrates, hypersaline sea ice, super-cooled sea ice, arctic tundra, Sargasso Sea, open ocean pelagic,
marine snow, microbial mats, whale falls, springs, hydrothermal vents, insect and nematode gut microbial 
communities, plant endophytes, epiphytic water samples, industrial sites or ex situ enrichments, and the 
like. The environmental sample can comprise air, water, sediment, soil or rock. 

[00060] In one aspect of the systems and methods of the invention, the population of cells comprises at 
least one eukaryote cell, prokaryote cell, myxobacteria (epothilone) cell, yeast cell, archaeal cell, plant cell, 
mammalian cell, insect cell or protozoan cell. The population of cells can comprise a mixture of materials, 
such as a biological sample, soil or sludge. The biological sample can comprise a plant sample, a food 
sample, a gut sample, a salivary sample, a blood sample, a sweat sample, a urine sample, a spinal fluid 
sample, a tissue sample, a vaginal swab, a stool sample, an amniotic fluid sample or a buccal mouthwash 
sample. The encapsulated or enclosed cell can be a microorganism. The microorganism can be a bacterial 
cell, a yeast cell, an archaeal cell, a plant cell, a mammalian cell, an insect cell or a protozoan cell. The 
encapsulated or enclosed cell can be an extremophile, such as hyperthermophiles, psychrophiles, 
halophiles, psychrotrophs, alkalophiles and acidophiles. 

[00061] In one aspect of the systems and methods, the microenvironment is designed/manufactured such
that fluid (e.g., an aqueous nutrient mixture, such as a media) and/or a probe (e.g., an antibody or a
nucleic acid probe, e.g., an oligonucleotide, a fosmid, a PCR primer, and the like) can pass into and out of
the microenvironment. In other words, the microenvironment is designed/manufactured to be porous. In
one aspect, the microenvironment comprises a porous gel, such as a porous gel microdroplet (GMD). Thus,
microenvironments used to practice the systems and methods of the invention can be of varying porosities,
e.g., they can be designed/manufactured to allow molecules (e.g., nutrients, hormones, cytokines, probes,
etc., such as proteins (e.g., antibodies), plasmids or fosmids) of any desired size to pass into or out of the
microenvironment (i.e., designed such that molecules over a certain size can or cannot pass through the
microenvironment).

[00062] In one aspect, one or more cells can be encapsulated or enclosed in each porous gel microdroplet
(GMD). The porous gel microdroplet (GMD) can comprise a CELMIX™ emulsion matrix or a
CELGEL™ encapsulation matrix. The microenvironment can comprise a hydrogel matrix, a porous
membrane or a selectively permeable membrane, a microfluidic channel, a liposome or a ghost cell, a
(per)fluorinated amorphous polymer, or a capillary array. The capillary array can comprise a non-addressable
capillary array or a GIGAMATRIX™ array. The microenvironment can comprise a porous
chromatographic membrane or a three-dimensional porous structure. In one aspect, two, three, four, five, 
six, seven, eight, nine, ten or more cells are encapsulated or enclosed in each microenvironment. 

[00063] In one aspect of the systems and methods, the microenvironment comprises a growth column or 
equivalent (e. g. , a variation/modification of a chromatography column). 

The growth column can comprise a capillary, e.g., a capillary array. The capillary array can comprise a 
non-addressable high throughput capillary-based array in a holding plate. The non-addressable 
capillary-based array can be GIGAMATRIX™. The growth column can comprise a chromatography column or 
variation/modification thereof. 

[00064] In one aspect of the systems and methods, the conditions allowing the maintained cell to survive 
comprise providing nutrients at in situ concentrations. In one aspect, conditions allowing the maintained 
cell to survive are equivalent to environmental conditions from which the cell was initially derived, for 
example, the environmental conditions can be the equivalent of a geothermal field, a hydrothermal field, an 
acidic soil, a sulfotara mud pot, a boiling mud pot, a pool, a hot-spring, a geyser, a tropical soil, a 
temperate soil, an arid soil, a compost pile, a manure pile, a marine sediment, a freshwater sediment, a 
water concentrate, a hypersaline sea ice, a super-cooled sea ice, an arctic tundra, a fresh water environment, 
a salt water marine environment, an open ocean pelagic environment, a marine snow, a microbial mat, a 
whale fall, a spring or a hydrothermal vent. 

[00065] In one aspect, the aqueous nutrient mixture diffuses, or flows, through the containment device 
such that the microenvironments are suspended in the containment device, circulated in the containment 
device or agitated in the containment device. The containment device can comprise a system comprising an 
influx port and an efflux port for the aqueous nutrient media. The influx port can be positioned at the 
bottom of the system and the efflux port can be positioned above the influx port. In one aspect, the influx 
port is positioned at the bottom of a column and the efflux port is positioned at the top of the column. The 
system can also be fitted/designed such that it can recycle or re-circulate aqueous nutrient mixture (e. g. , 
media). The system can also comprise a collection device for media leaving the efflux port. The system can 
comprise a membrane or filter to prevent microenvironments from passing out of the containment device 
when the aqueous nutrient mixture is cycled out of the containment device. The system can also be 
fitted/designed such that waste is removed or modified from the media before recycling the aqueous 
nutrient mixture. 

[00066] The methods and systems can further comprise incubating and culturing the cell in the 
microenvironment under conditions allowing growth or proliferation of the cell into a micro-colony 
comprising at least two daughter cells. The micro-colony can comprise between about 4 and 10, 4 and 50, 
4 and 100, or 4 and 1000 or more cells. 

[00067] The method can further comprise isolating a microenvironment. The method can further comprise 
isolating a microenvironment and isolating a cell or a micro-colony from the microenvironment, or, further 
comprise isolating a cell from the micro-colony. In one aspect, isolating a microenvironment comprises 
sorting an encapsulated or enclosed micro-colony by its size. In one aspect, isolating a microenvironment 
comprises sorting an encapsulated or enclosed micro-colony by the number of cells in the micro-colony. In 
one aspect, isolating a microenvironment comprises sorting an encapsulated or enclosed micro-colony 
based on whether or not at least one cell in the micro-colony expresses a marker. The marker can be a 
nucleic acid, a carbohydrate, a small molecule or a protein. The marker can be a detectable probe, such as a 
labeled nucleic acid probe or a labeled antibody or other binding molecule. 

The labeled nucleic acid probe can be detected by FISH or flow cytometry. 

[00068] In one aspect, sorting a micro-colony by size or by the number of cells in the micro-colony 
comprises using flow cytometry, such as FACS. The method can further comprise maintaining the isolated 
cell by re-encapsulating or re-enclosing the cell in a microenvironment and re-culturing. In one aspect, 
between about 1 and 16, 1 and 100, 4 and 64, 16 and 200 or more cells are maintained in each 
re-encapsulated micro-colony. 
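
By way of illustration only, the selection of micro-colonies by the number of cells they contain can be sketched as a simple gate over per-microenvironment cell counts. The function name, the identifiers and the counts below are hypothetical and are not part of the disclosure; in practice the counts would be estimated from flow cytometric signals (e.g., light scatter or fluorescence) rather than known exactly.

```python
def gate_by_cell_count(colony_counts, min_cells, max_cells):
    """Return the IDs of micro-colonies whose cell count lies in [min_cells, max_cells].

    colony_counts maps a (hypothetical) microenvironment ID to an estimated cell count.
    """
    return [
        colony_id
        for colony_id, n_cells in colony_counts.items()
        if min_cells <= n_cells <= max_cells
    ]


# Hypothetical estimated counts for four gel microdroplets.
counts = {"gmd-01": 1, "gmd-02": 12, "gmd-03": 64, "gmd-04": 250}

# Keep micro-colonies of between 4 and 64 cells, one of the ranges recited above.
selected = gate_by_cell_count(counts, min_cells=4, max_cells=64)
```

The gate bounds are parameters so that any of the recited ranges (e.g., 1 to 16, or 16 to 200) can be applied to the same data.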

[00069] The method can further comprise screening the interactions between enclosed or encapsulated 
cells. The method can further comprise re-culturing the isolated microenvironment under the same or 
different conditions. The method can further comprise direct amplification of a nucleic acid from an 
enclosed or encapsulated cell. The method can further comprise direct amplification of a nucleic acid from 
a cultivated, encapsulated cell. 

In one aspect, isolating a microenvironment comprises use of Microcapsule (MiC) in situ hybridization. 
Isolating a microenvironment can comprise use of Rolling Circle Amplification (RCA). In one aspect, 
isolating a microenvironment comprises use of Large Insert FACS Biopanning (LIFB) FISH or 
fluorescence detection. 

[00070] The invention provides methods and systems for identifying a polynucleotide encoding an activity 
of interest comprising encapsulating or enclosing in a microenvironment at least a single cell from a mixed 
population of cells or an uncultivated population of cells, wherein the microenvironment is porous, e.g., 
allows exchange of aqueous nutrients (e.g., media), hormones, small molecules, growth factors, probes, 
proteins (e.g., antibodies, cytokines, etc.), plasmids and/or fosmids, from the exterior to the interior of the 
microenvironment, or, to diffuse, or flow, through its interior; placing the encapsulated cell in a 
containment device; incubating the encapsulated or enclosed cell in the microenvironment under conditions 
allowing the encapsulated or enclosed cell to survive and be maintained, wherein conditions allowing the 
maintained cell to survive comprise exchange of aqueous nutrients from the exterior to the interior of the 
microenvironment, or, diffusing or flowing an aqueous nutrient mixture through the containment device; 
contacting a nucleic acid isolated or derived from the encapsulated cell with at least one nucleic acid probe 
comprising a detectable label, wherein the nucleic acid probe is capable of specifically hybridizing to a 
polynucleotide of interest; and detecting a specific hybridization between a nucleic acid isolated or derived 
from the cell and the nucleic acid probe, thereby identifying a polynucleotide of interest. 

[00071] In one aspect, the method further comprises enriching for a polynucleotide encoding an activity of 
interest by isolating or amplifying the nucleic acid identified by the specific hybridization between the 
nucleic acid isolated or derived from the encapsulated cell and the nucleic acid probe. The method can 
further comprise use of Large Insert FACS Biopanning (LIFB), including e. g. , FACS, RCA and/or MiC. 

[00072] The invention provides methods and systems for identifying or detecting a molecule (e.g., a 
biomolecule, a drug, a toxin, a probe, an infectious agent, etc.) of interest from a population of cells, 
comprising: (a) encapsulating or enclosing at least one cell from a population in a microenvironment, 
wherein the microenvironment is porous, e.g., allows exchange of aqueous nutrients (e.g., media), 
hormones, small molecules, growth factors, probes, proteins (e.g., antibodies, cytokines, etc.), plasmids 
and/or fosmids, from the exterior to the interior of the microenvironment, or, to diffuse or to flow through 
its interior; (b) placing the microenvironment comprising at least one cell in a containment device; (c) 
incubating the microenvironment in the containment device under conditions allowing the encapsulated or 
enclosed cell to survive and be maintained, wherein conditions allowing the maintained cell to survive 
comprise exchange of aqueous nutrients (e.g., media), hormones, small molecules, growth factors, probes, 
proteins (e.g., antibodies, cytokines, etc.), plasmids and/or fosmids, from the exterior to the interior of the 
microenvironment, or, diffusing or flowing an aqueous nutrient mixture through the containment device; 
and (d) identifying or detecting the biomolecule of interest. The method can further comprise isolating the 
biomolecule of interest. The method can further comprise an incubating step that further comprises 
sufficient time for the enclosed or encapsulated cells to grow or proliferate. The population of cells can be 
from a mixed population of cells or a population of uncultivated cells. The biomolecule of interest can 
comprise a nucleic acid, a protein, a carbohydrate, a lipid or a small molecule. The nucleic acid can 
comprise a genomic DNA (DNA) or an RNA. 

[00073] In one aspect, the identifying step comprises the steps: a) contacting a nucleic acid isolated or 
derived from the enclosed or encapsulated cell with at least one nucleic acid probe comprising a detectable 
label, wherein the nucleic acid probe is capable of specifically hybridizing to a polynucleotide encoding an 
activity of interest; and b) detecting a specific hybridization between a nucleic acid isolated or derived from 
the encapsulated cell and the nucleic acid probe, thereby identifying a polynucleotide encoding an activity 
of interest. The method can further comprise use of Large Insert FACS Biopanning (LIFB) to identify a 
nucleic acid. 

[00074] The molecule of interest (nucleic acid) can be identified by detecting hybridization with a probe 
having a complementary sequence to a nucleic acid of interest. 

The detecting, identifying or isolating of the molecule of interest can be done by FISH, by 16S rRNA 
hybridization, by using an antibody that specifically binds to the molecule of interest, flow cytometry, 
fluorescence analysis, such as FACS, and the like. 

[00075] In one aspect, the method further comprises isolating a cell comprising the biomolecule of interest. 
The cell can be identified or isolated by laser capture microscopy or by laser catapult. 

[00076] In one aspect, the method further comprises the step of visualizing the morphology of a cell. The 
visualizing step can be done by providing a monolayer of microenvironments comprising at least one cell 
and a substrate to support said monolayer for use with a laser capture microscopy system. The visualizing 
can be done by a laser capture microdissection device. The laser capture microdissection device can be, e.g., 
a PALM™ laser, operably linked to a microscope or equivalent. 

[00077] In one aspect, the identified biomolecule of interest comprises a transcript, a gene or a gene 
pathway. 

[00078] In one aspect, the method further comprises analysis of the molecule (e. g., biomolecule, drug, 
toxin) of interest after identifying the biomolecule of interest. A transcript or a gene can be amplified 
before the identifying or analyzing step, e.g., by PCR or by rolling circle amplification. The method can 
further comprise generating a library from the amplified sequences. The method can further comprise 
sequencing the library. In one aspect, the biomolecule of interest comprises a small molecule, a protein, a 
lipid, a metabolite, a secondary metabolite, a carbohydrate or a nucleic acid. 

[00079] In one aspect, the method further comprises a step (e) isolating a microenvironment comprising at 
least one cell. The method can further comprise isolating a biomolecule from a microenvironment. In one 
aspect, the method further comprises isolating a cell from the isolated microenvironment, followed by 
isolating a biomolecule from the cell. 
The biomolecule can be secreted by a cell into the microenvironment. In one aspect, the biomolecule is 
detected by hybridization, sequencing, enzymatic reaction or a secondary metabolite assay. 

[00080] The invention provides methods and systems for identifying or detecting a molecule of interest 
from a population of cells, comprising: (a) encapsulating or enclosing at least one cell from the population 
in a microenvironment, wherein the microenvironment is porous, e. g. , allows aqueous nutrients (e. g. , 
media), small molecules, growth factors, hormones, probes, proteins (e. g. , antibodies, cytokines, etc. ) 
and/or plasmids or fosmids, to diffuse, or, to flow through its interior; (b) placing the microenvironment 
comprising said at least one cell in a containment device, wherein the containment device is fitted or 
configured such that an aqueous nutrient mixture can diffuse or flow or circulate through the containment 
device such that aqueous nutrients diffuse, flow or circulate through the interior of the microenvironment; 
(c) incubating the microenvironment comprising said at least one cell in the containment device under 
conditions allowing the encapsulated or enclosed cell to survive and be maintained, thereby isolating and 
maintaining the cell, wherein conditions allowing the maintained cell to survive comprise diffusing or 
flowing an aqueous nutrient mixture through the containment device; and (d) identifying or detecting a 
microenvironment comprising the biomolecule of interest. 

[00081] The invention provides methods and systems for maintaining a cell from a population of cells 
comprising: (a) a microenvironment encapsulating or enclosing at least a single cell from a population of 
cells, wherein the microenvironment allows fluids, aqueous nutrients (e. g. , media), small molecules, 
growth factors, hormones, probes, proteins (e. g., antibodies, cytokines, etc. ) and/or plasmids or fosmids, 
to diffuse or flow through its interior ; (b) a containment device capable of incubating the 
microenvironment under conditions allowing the cell to survive and be maintained, wherein the 
containment device is fitted or configured such that an aqueous nutrient mixture can diffuse, flow or 
circulate through the containment device such that aqueous nutrients diffuse, flow or circulate through the 
interior of the microenvironment and the desired conditions can be maintained. 

[00082] In one aspect, a system of the invention comprises at least two containers in parallel arrangement 
with an inlet and outlet flow. The system can comprise at least 5, 10, 12, 18, 24, 36, 96 or more containers in 
parallel arrangement to receive media and return waste. In one aspect, at least two or more 
different species of microorganisms are cultured in the system. The at least two 
different species can be in different containers. The at least two different species can be 
bacteria and fungi. In one aspect, the system comprises simultaneously maintaining at least two, three, four, 
five or more independently isolated samples from a similar environment. 

[00083] In alternative aspects, microenvironments used in the systems and methods of the invention 
comprise a pore size sufficient to allow a molecule of at least 1 kilodalton (kD) to pass through the pore, or, 
the microenvironment comprises a pore size that does not allow a molecule greater than 1 kilodalton (kD) 
to pass through the pore. The microenvironments used in the systems and methods of the invention can 
comprise a pore size that does not allow a molecule greater than about 10, 50, 100, 150 or 200 kilodalton 
(kD) to pass through the pore. In one aspect, if molecules larger than about 100, 150, or 200 kD are desired 
to be co-incubated with the cells in the microenvironments, these molecules are co-encapsulated with the 
cells. 
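
By way of illustration only, the choice between supplying a molecule through the pores and co-encapsulating it with the cells can be sketched as a comparison of molecular mass against the pore cutoff. The class, the function name and the example masses below are hypothetical and approximate; an actual microenvironment may also exclude or admit molecules on the basis of shape, charge or matrix interactions, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Molecule:
    name: str
    mass_kd: float  # approximate molecular mass in kilodaltons (kD)


def plan_delivery(molecules, cutoff_kd):
    """Split molecules into those at or below the pore cutoff (can be supplied
    from outside) and those above it (to be co-encapsulated with the cells)."""
    diffusible = [m.name for m in molecules if m.mass_kd <= cutoff_kd]
    co_encapsulate = [m.name for m in molecules if m.mass_kd > cutoff_kd]
    return diffusible, co_encapsulate


# Hypothetical reagents with approximate masses.
reagents = [
    Molecule("glucose", 0.18),
    Molecule("IgG antibody", 150.0),
    Molecule("large complex", 400.0),
]

# Assume a microenvironment with a cutoff of about 200 kD.
diffusible, co_encapsulate = plan_delivery(reagents, cutoff_kd=200.0)
```

With a lower assumed cutoff, e.g., 100 kD, the antibody would move to the co-encapsulation list as well.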

[00084] In one aspect, the population of cells comprises a mixed population of cells. The mixed population 
of cells can be uncultivated. The population of cells can be uncultivated. 

The population of cells can be derived from an environmental sample. The environmental sample can be 
derived from geothermal fields, hydrothermal fields, acidic soils, sulfotara mud pots, boiling mud pots, 
pools, hot-springs, geysers, marine actinomycetes, metazoan, endosymbionts, ectosymbionts, tropical soil, 
temperate soil, arid soil, compost piles, manure piles, marine sediments, freshwater sediments, water 
concentrates, hypersaline sea ice, super-cooled sea ice, arctic tundra, Sargasso Sea, open ocean pelagic, 
marine snow, microbial mats, whale falls, springs, hydrothermal vents, gut microbial communities (e. g. , 
insect and nematode guts), plant endophytes, epiphytic water samples, industrial sites and ex situ 
enrichments. The environmental sample can comprise or be derived from air, water, sediment, soil and 
rock. 

[00085] In one aspect, the population of cells comprises at least one bacterial cell, eukaryote cell, 
prokaryote cell, myxobacteria (epothilone) cell, yeast cell, archaeal cell, plant cell, mammalian cell, insect 
cell or protozoan cell. The population of cells can comprise a mixture of materials, such as a biological 
sample, soil or sludge. The biological sample can comprise a plant sample, a food sample, a gut sample, a 
salivary sample, a blood sample, a sweat sample, a urine sample, a spinal fluid sample, a tissue sample, a 
vaginal swab, a stool sample, an amniotic fluid sample or a buccal mouthwash sample. 

[00086] In one aspect, the encapsulated or enclosed cell is a microorganism. In one aspect, the cell is a 
bacterial cell, a yeast cell, an archaeal cell, a plant cell, a mammalian cell, an insect cell or a protozoan cell. 
The encapsulated or enclosed cell can be an extremophile, such as hyperthermophiles, psychrophiles, 
halophiles, psychrotrophs, alkalophiles and acidophiles. 

[00087] In one aspect, polypeptides (e.g., growth factors, cytokines, antibodies, etc.) can pass into or out 
of the microenvironment. In one aspect, fluids, e.g., aqueous fluids, can pass into or out of the 
microenvironment. In one aspect, nucleic acids, antibodies, hormones, small molecules, or cytokines can 
pass into or out of the microenvironment. In one aspect, plasmids or fosmids can pass into or out of the 
microenvironment. 

[00088] In one aspect, the microenvironment comprises a porous gel, such as a porous gel microdroplet 
(GMD). In one aspect, one or more cells is encapsulated or enclosed in each porous gel microdroplet 
(GMD). The porous gel microdroplet (GMD) can comprise an emulsion matrix or an encapsulation matrix. 
The microenvironment can comprise a hydrogel matrix, a porous membrane or a selectively permeable 
membrane. In one aspect, the microenvironment comprises a microfluidic channel, a liposome or a ghost 
cell, a (per)fluorinated amorphous polymer or a capillary array, such as a GIGAMATRIX™ array. 

In one aspect, the microenvironment comprises a porous chromatographic membrane or a 
three-dimensional porous structure. 

[00089] In one aspect, two, three, four, five, six, seven, eight, nine, ten or more cells are encapsulated or 
enclosed in each microenvironment. 

[00090] In one aspect, the microenvironment comprises a growth column. The growth column can 
comprise a capillary, e. g. , a capillary array, such as a non-addressable high throughput capillary-based 
array in a holding plate, e.g., GIGAMATRIX™. In one aspect, the growth column comprises a 
chromatography column. 

[00091] In one aspect, the conditions for maintaining the cells comprise providing nutrients at in situ 
concentrations. The conditions for maintaining the cells can be equivalent to environmental conditions 
from which the cell was initially derived. The environmental conditions can be the equivalent of a 
geothermal field, a hydrothermal field, an acidic soil, a sulfotara mud pot, a boiling mud pot, a pool, a 
hot-spring, a geyser, a tropical soil, a temperate soil, an arid soil, a compost pile, a manure pile, a marine 
sediment, a freshwater sediment, a water concentrate, a hypersaline sea ice, a super-cooled sea ice, an arctic 
tundra, a fresh water environment, a salt water marine environment, an open ocean pelagic environment, a 
marine snow, a microbial mat, a whale fall, a spring or a hydrothermal vent. 

[00092] In one aspect, the aqueous nutrient mixture is flowed through the containment device such that the 
microenvironments are suspended in the containment device, circulated in the containment device or 
agitated in the containment device. The containment device can comprise a system comprising an influx 
port and an efflux port for media. In one aspect, the influx port is positioned at the bottom of the system 
and the efflux port is positioned above the influx port. The influx port can be positioned at the bottom of a 
column and the efflux port is positioned at the top of the column. 

[00093] In one aspect, the system comprises a collection device for aqueous nutrient mixture leaving the 
efflux port. The system can recycle aqueous nutrient mixture. The system can filter waste from the aqueous 
nutrient mixture before recycling the media. 

[00094] The invention provides systems for maintaining a cell from a population of cells comprising: (a) a 
plurality of microenvironments, wherein each microenvironment encapsulates or encloses at least a single 
cell from the population of cells, and the microenvironment is configured such that a nutrient or a growth 
factor or a probe can pass through, flow or circulate through its interior; (b) a containment device capable 
of incubating the microenvironments under conditions allowing the cell to survive and be maintained, 
wherein the containment device is fitted or configured such that an aqueous nutrient mixture can be flowed 
or circulated through the containment device such that aqueous nutrients flow or circulate through the 
interior of the microenvironments and the desired conditions can be maintained. 

[00095] The present invention comprises compositions (systems) and methods for high throughput 
screening for biomolecules of interest. In one aspect, the invention provides methods for identifying a 
biomolecule of interest from a population of cells, comprising: (a) encapsulating at least one cell from a 
population in a microenvironment; (b) placing the microenvironment comprising at least one cell in a 
growth column; (c) incubating the microenvironment in the growth column under conditions allowing the 
encapsulated cell to survive and be maintained, thereby isolating and maintaining the cell; and (d) isolating 
said cell. 

[00096] In one aspect, the invention provides a method, wherein said identifying step optionally comprises 
the steps: a) contacting a nucleic acid isolated or derived from the encapsulated cell with at least one 
nucleic acid probe comprising a detectable label, wherein the nucleic acid probe is capable of specifically 
hybridizing to a polynucleotide encoding an activity of interest; and b) detecting a specific hybridization 
between a nucleic acid isolated or derived from the encapsulated cell and the nucleic acid probe, thereby 
identifying a polynucleotide encoding an activity of interest. 

[00097] In one aspect, the invention provides a method, wherein said incubating step further comprises 
sufficient time for said encapsulated cells to proliferate. In another aspect, the cells are maintained. In 
another aspect, the cells are allowed to grow. 

[00098] In one aspect, the invention provides methods for isolating and/or maintaining a cell from a mixed 
population of uncultivated cells comprising: (a) encapsulating in a microenvironment at least a single cell 
from the mixed population ; (b) placing the encapsulated cell in a growth column ; and (c) incubating the 
encapsulated cell in the growth column under conditions allowing the encapsulated cell to survive and be 
maintained, thereby isolating and maintaining the cell. In one aspect, the mixed population of uncultivated 
cells comprises an environmental sample, such as a sample from, or derived from, geothermal fields, 
hydrothermal fields, acidic soils, sulfotara mud pots, boiling mud pots, pools, hot-springs, geysers, marine 
actinomycetes, metazoan, endosymbionts, ectosymbionts, tropical soil, temperate soil, arid soil, compost 
piles, manure piles, marine sediments, freshwater sediments, water concentrates, hypersaline sea ice, 
super-cooled sea ice, arctic tundra, Sargasso Sea, open ocean pelagic, marine snow, microbial mats, whale falls, 
springs, hydrothermal vents, gut microbial communities (e.g., insect and nematode guts), plant 
endophytes, epiphytic water samples, industrial sites and/or ex situ enrichments. In one aspect, the 
environmental sample is a eukaryote, prokaryote, myxobacteria (epothilone), and/or isolated from or 
derived from air, water, sediment, soil and/or rock, among others. 

[00099] In one aspect, the mixed population of uncultivated cells, or the single isolated cell, can comprise a 
mixture of materials. The mixture of materials can comprise a biological sample, soil or sludge. In one 
aspect, the biological sample comprises a plant sample, a food sample, a gut sample, a salivary sample, a 
blood sample, a sweat sample, a urine sample, a spinal fluid sample, a tissue sample, a vaginal swab, a 
stool sample, an amniotic fluid sample and/or a buccal mouthwash sample. 

[000100] In one aspect, a cell from a population of cells, may be isolated or cultured from a mixed 
population of cells, and may be uncultivated cells, which may comprise a microorganism, such as a 
bacterial cell, a yeast cell, an archaeal cell, a plant cell, a mammalian cell, an insect cell or a protozoan cell, 
or, a virus or a phage. The cell can comprise any extremophile, such as hyperthermophiles, psychrophiles, 
halophiles, psychrotrophs, alkalophiles, acidophiles and the like. 

[000101] The methods and systems of the invention can incorporate any cell encapsulation strategy. For 
example, depending on the cell population being used, one skilled in the art would understand how to select 
the appropriate encapsulation strategy. Different types of encapsulation strategies and compounds or 
polymers can be used with the present invention. 

For instance, high temperature agaroses can be employed for making microdroplets stable at high 
temperatures, allowing stable encapsulation of cells subsequent to heat kill steps utilized to remove all 
background activities when screening for thermostable bioactivities. 

[000102] In one aspect, the cells are encapsulated in a microenvironment, e.g., a porous gel, such as a gel 
microdroplet (GMD), e.g., a porous gel microdroplet (GMD), a liposome, a ghost cell, or any equivalent. 
The microenvironment can comprise a hydrogel matrix, or equivalent, a porous membrane, a 
three-dimensional porous structure, or a selectively permeable membrane, or equivalent. In one aspect, the 
porous gel microdroplet (GMD) comprises an emulsion matrix, or equivalent, or an encapsulation matrix, or 
equivalent. The three-dimensional porous structure can be as described, e.g., in U.S. Patent No. 
6,627,291. In one aspect, the microenvironment comprises a microfluidic channel, as described, e.g., in U.S. 
Patent No. 6,748,978. In one aspect, the microenvironment comprises a porous chromatographic 
membrane as described, e.g., in U.S. Patent No. 6,726,818. In one aspect, the microenvironment 
comprises a (per)fluorinated amorphous polymer as described, e.g., in U.S. Patent No. 6,726,840. 

[000103] In one aspect, the microenvironment comprises a capillary array, such as the GIGAMATRIX™ 
(Diversa Corporation, San Diego, CA) capillary arrays; and arrays as described in, e.g., U.S. Patent 
Application No. 20020080350 A1; WO 0231203 A; WO 0244336 A, and equivalents. In one aspect, the 
capillary array includes a plurality of capillaries formed into an array of adjacent capillaries, wherein each 
capillary comprises at least one wall defining a lumen for retaining at least one cell. The lumen may be 
cylindrical, square, hexagonal or any other geometric shape so long as the walls form a lumen for retention 
of a cell. The capillaries of the capillary array can be held together in close proximity to form a planar 
structure. The capillaries can be bound together, by being fused (e. g. , where the capillaries are made of 
glass), glued, bonded, or clamped side-by-side. 

Additionally, the capillary array can include interstitial material disposed between adjacent capillaries in 
the array, thereby forming a solid planar device containing a plurality of through-holes. 

[000104] In one aspect, one cell is encapsulated in each microenvironment, e.g., gel microdroplet (GMD), 
liposome, ghost cell, capillary, channel, and the like, or, one to four cells, or two, three, four, five, six, 
seven, eight, nine, ten or more cells, can be encapsulated in each microenvironment. There need not be a 
uniformity of cell distribution in the microenvironments in a containment device; e.g., a plurality of 
microenvironments can have one cell, another plurality of microenvironments in the same containment 
device can have two, three, four, etc. or more cells. 

[000105] In one aspect, the use of GMDs containing (physically) single cells which can take up nutrients, 
secrete products, and grow to form colonies is useful in the present invention. 

The diffusional properties of GMDs may be made such that sufficient extracellular product remains 
associated with each individual GMD, so as to permit flow cytometric analysis and cell sorting on the basis 
of concentration of secreted molecule within each microdroplet. 

Beads have also been used to isolate mutants growing at different rates, and to analyze antibody secretion 
by hybridoma cells and the nutrient sensitivity of hybridoma cells. 

[000106] In one aspect, gel microdroplet (GMD) technology is used to amplify signals available in flow 
cytometric analysis, and in permitting the screening and sorting of microbial strains in strain improvement 
and isolation programs. GMD or other related technologies can be used in the present invention to localize, 
sort as well as amplify signals in the high throughput screening of recombinant libraries. Cell viability 
during the screening is not an issue or concern since nucleic acid can be recovered from the microdroplet. 

[000107] In one aspect, the growth containment device can be a growth column, which can comprise a 
capillary, such as a capillary array. In another aspect, the capillary array comprises a non-addressable high 
throughput capillary-based array in a holding plate. In another aspect, the non-addressable capillary-based 
array in a holding plate is, for example, a GIGAMATRIX™ (Diversa Corporation, San Diego, CA) 
technology holding plate. The growth column may comprise a chromatography column, or equivalent. 

[000108] In one aspect, conditions allowing the encapsulated cell to survive and be maintained comprise 
providing nutrients at in situ concentrations. In another aspect, the cells proliferate. In a further aspect, the 
cells are grown. The conditions allowing the encapsulated cell to survive and be maintained may comprise 
flowing or circulating (e.g., including agitating or mixing) an aqueous nutrient mixture (e.g., a media) 
through the containment device (e.g., growth column). The flowing is done in a manner to maintain 
suspension of the discrete microenvironments, wherein a closed system is provided with a first port for 
influx of aqueous nutrient mixture and a second port for efflux of aqueous nutrient mixture, including, e.g., 
waste, and a collection device for said waste. The system can further comprise flowing of aqueous nutrient 
mixture upwards through the containment device, e.g., as a closed system. 

[000109] In one aspect, the aqueous nutrient mixture (e.g., media) is recycled, e.g., removed from the 
containment device at one point and returned to the containment device at another point. A trap, such as a 
screen or filter, can cover an efflux port for aqueous nutrient mixture flowing out of the containment device 
to prevent microenvironments (e.g., microcapsules) from flowing out of the containment device. The 
aqueous nutrient mixture (e.g., media) can be filtered or modified before recycling. For example, in one 
aspect the phenolic compounds dissolved in aqueous fluid are removed from the aqueous nutrient mixture 
(as described, e.g., in U.S. Patent No. 6,586,638). The temperature or pH of the aqueous nutrient mixture 
(e.g., media) can be adjusted. Thus, in one aspect, the system of the invention comprises devices or fittings 
for reading and/or adjusting clarity, nutrient concentration, microenvironment concentration or size, 
temperature, pH, hypertonicity or hypotonicity of the aqueous nutrient mixture, and the like. 
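
As an illustration only, the reading and adjusting of the recycled nutrient mixture can be sketched as a simple range check. The parameter names, the `Range` type, and every numeric setpoint below are hypothetical; the specification gives no target values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Closed interval of acceptable values for one monitored parameter."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Hypothetical setpoints, chosen only for illustration.
SETPOINTS = {
    "temperature_c": Range(18.0, 22.0),
    "ph": Range(7.5, 8.3),
}


def parameters_to_adjust(readings, setpoints=SETPOINTS):
    """Return the sorted names of monitored parameters whose reading is out of range."""
    return sorted(
        name
        for name, allowed in setpoints.items()
        if name in readings and not allowed.contains(readings[name])
    )
```

A controller built along these lines would read each fitting, call `parameters_to_adjust`, and act only on the parameters it returns before the mixture is returned to the containment device.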

[000110] In one aspect, the method comprises incubating and culturing the microenvironment comprising at least one cell 
contained therein in the growth column under conditions allowing growth or proliferation of the cells into a 
microcolony comprising at least two daughter cells. 

[000111] In one aspect, the method further comprises incubating and culturing the encapsulated cell in the 
growth column under conditions allowing growth or proliferation of the cells into a microcolony 
comprising at least two daughter cells. The microcolony can comprise between about 2, 3, 4, 5, 6, 7, 8, 9, 10 
and about 100, 200, 300 or more cells. 

[000112] In one aspect, the method further comprises isolating a gel microdroplet. The method can 
comprise isolating a microcolony from the gel microdroplet. The method can comprise isolating a cell from 
the microcolony. In one aspect, the isolating of a gel microdroplet can comprise sorting an encapsulated 
microcolony by size, e.g., by using flow cytometry. In one aspect, the gel microdroplet is isolated by 
FACS. 
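
A minimal sketch of size-based sorting follows, assuming that each flow cytometry event is reduced to a forward-scatter and a side-scatter value and that microdroplets holding a microcolony scatter more light than empty ones. The `Event` type, the function name, and the thresholds are hypothetical and would have to be set empirically on a given instrument.

```python
from typing import Iterable, List, NamedTuple


class Event(NamedTuple):
    """One detected particle, in arbitrary instrument units."""
    forward_scatter: float
    side_scatter: float


def gate_microcolonies(
    events: Iterable[Event], fsc_min: float, ssc_min: float
) -> List[Event]:
    """Keep events at or above both scatter thresholds (a rectangular gate)."""
    return [
        e
        for e in events
        if e.forward_scatter >= fsc_min and e.side_scatter >= ssc_min
    ]
```

A rectangular gate is the simplest choice; a real sort gate is usually drawn on the scatter plot itself and may be polygonal.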

[000113] In one aspect, the method further comprises maintaining the isolated cell by re-encapsulating and 
re-culturing the isolated cell. In one aspect, between about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 
18, 19, 20 and 100 or more cells are maintained in each re-encapsulated microcolony. 

[000114] In one aspect, the method further comprises screening the interactions between encapsulated 
cells. In one aspect, the method further comprises re-culturing the isolated gel microdroplet under the same 
or different conditions. In one aspect, the method further comprises direct amplification of nucleic acid 
from the encapsulated cell. In one aspect, the method further comprises direct amplification of nucleic acid 
from the cultivated encapsulated cells. 

[000115] The invention also provides methods and systems for identifying a polynucleotide encoding an 
activity of interest comprising encapsulating a cell in a microenvironment from a population of cells, wherein the 
population may be a mixed population and may be uncultivated; placing the microenvironment comprising at 
least one encapsulated cell in a growth column; incubating the microenvironment and said at least one cell 
in the growth column under conditions allowing the encapsulated cell to survive and be maintained; 
contacting a nucleic acid isolated or derived from the encapsulated cell with at least one nucleic acid probe 
comprising a detectable label, wherein the nucleic acid probe is capable of specifically hybridizing to a 
polynucleotide encoding an activity of interest; and, detecting a specific hybridization between a nucleic 
acid isolated or derived from the encapsulated cell and the nucleic acid probe, thereby identifying a 
polynucleotide encoding an activity of interest. In one aspect, the method further comprises enriching for a 
polynucleotide encoding an activity of interest by isolating or amplifying the nucleic acid identified by the 
specific hybridization between the nucleic acid isolated or derived from the encapsulated cell and the 
nucleic acid probe. Individual microorganisms may be detected from a mixed population without 
disintegrating the cell structure, using fluorescence in situ hybridization (FISH) techniques. 

[000116] In one aspect, fluorescent in situ hybridization (FISH) is used to identify specific sequences from 
chromosomes by the hybridization of a fluorophore-labelled probe with a complementary nucleic acid 
sequence. Direct or indirect detection methods known in the art may be used with this technique. 
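
The base-pairing rule behind probe hybridization can be sketched in a few lines. This is a simplified model that assumes perfect Watson-Crick complementarity over the whole probe and plain DNA letters (A, C, G, T); real hybridization tolerates mismatches depending on stringency. The function names are illustrative.

```python
from typing import List

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def probe_binding_sites(target: str, probe: str) -> List[int]:
    """Return 0-based start positions on the target strand where the probe
    can base-pair perfectly, i.e. where the target contains the probe's
    reverse complement."""
    site = reverse_complement(probe)
    target = target.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions
```

For example, the probe `CCAAGC` pairs with the stretch `GCTTGG`, so `probe_binding_sites("AAGCTTGGCC", "CCAAGC")` reports a single site starting at position 2.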

[000117] In one aspect, the methods of the invention use ultra-high-throughput (UHTP) screening 
techniques, e.g., Large Insert FACS Biopanning (LIFB), to identify cells, or microcolonies having cells, 
which contain or express a specific nucleic acid. LIFB is an ultra-high throughput screening technology 
that is based on Microcapsule (MiC) in situ hybridization, Fluorescence Activated Cell Sorting (FACS) 
and Rolling Circle Amplification (RCA), as described in Example 6, below. This approach can be used to 
capture fosmid clones (e.g., from large insert fosmid libraries) carrying genes of interest. 

[000118] The microenvironments used to practice the invention can be "porous," which, in one aspect, 
means that they are porous to nutrients needed for cells, e.g., microbes, to grow, or that probes for 
hybridization can permeate easily into the microenvironment. In one aspect, the microenvironments 
comprise agarose. In one aspect, where it is desired to practice Microcapsule (MiC) in situ hybridization, 
the percentage of agarose for making MiCs that will withstand the harsh denaturation and still admit the probe and 
detection reagent into the microenvironment is between about 2% and 3%, for 
example, 2.5%. However, in other aspects, between about 1% and 10% or more can be used. In an exemplary 
protocol for making a microenvironment comprising agarose, particularly if it is desired to practice 
Microcapsule (MiC) in situ hybridization, blending is at 2600 rpm. However, one skilled in the art can 
choose any of a variety of blending speeds, which result in different yields and size distributions of MiCs. For a 
2.5% agarose blend, an exemplary protocol uses between about 2600 and 3000 rpm; however, slower or faster 
speeds can also be used. One exemplary protocol uses a mixture of OneCell agarose (One Cell Systems, 
Inc., Cambridge, MA) and regular agarose and a different speed to make a gel microdroplet (GMD). The 
agarose type, the oil and the speed determine the final size of the GMDs. Limitations on blending speed are 
dependent on the size of the blender and the size of the microcapsule, which in one aspect may need to be 
suitable for a flow cytometer, e.g., a FACS. In one aspect, a FACS nozzle is modified to accommodate 
larger microcapsules. 
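
The exemplary figures in this paragraph (about 2% to 3% agarose for MiC hybridization, and about 2600 to 3000 rpm for a 2.5% blend) can be captured as a small parameter check. This is only a bookkeeping sketch: the constant and function names are hypothetical, and values outside these windows are expressly permitted by the text, so a `False` result flags a departure from the exemplary protocol rather than an error.

```python
# Exemplary windows taken from the paragraph above; both are approximate.
MIC_AGAROSE_PERCENT = (2.0, 3.0)
BLEND_RPM_AT_2_5_PERCENT = (2600.0, 3000.0)


def check_mic_protocol(agarose_percent: float, blend_rpm: float) -> dict:
    """Report whether each parameter falls inside its exemplary window."""
    low_a, high_a = MIC_AGAROSE_PERCENT
    low_r, high_r = BLEND_RPM_AT_2_5_PERCENT
    return {
        "agarose_in_mic_window": low_a <= agarose_percent <= high_a,
        "rpm_in_exemplary_window": low_r <= blend_rpm <= high_r,
    }
```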

[000119] In one aspect, the separation of single biological objects is done by optical methods, such as by 
the use of optical tweezers, where an object is moving in an aqueous solution. (See K. Schutze, A. 
Clement-Sengwald, Nature 667 (vol 368) (1994) and US Patent No. 5,998,129, herein incorporated by 
reference in their entirety.) 

[000120] In another aspect, the methods and systems of the invention use lasers 
for assisting in sorting of biological objects and microstructures, and optical trapping and optical 
microdissection; these techniques are known in the art. See, e.g., US Patent No. 5,998,129 and WO 
98/14816. 

[000121] In another aspect, the methods and systems of the invention use Laser-Capture Microdissection 
(LCM) to allow the rapid isolation of specific intact cells from complex tissues by direct visualization. 
LCM provides a nondestructive cell sampling technique that has broad application, not only in the area of 
tissue dissection, but genomics and proteomics. 

Current methods use a transparent thermoplastic film applied to a prepared tissue sample and an infrared 
laser beam to melt the film over the chosen cells. The film can then be lifted off extracting the chosen cells, 
while leaving the rest behind. Analysis may then be done on the desired cells. See, e. g., Methods in 
Enzymology, Vol 356, Laser Capture Microscopy (2002). 

[000122] In one aspect, nucleic acids or nucleic acid libraries derived from populations of nucleic acids 
and/or organisms are screened very rapidly for bioactivities of interest utilizing liquid phase screening 
methods. The nucleic acids or nucleic acid libraries may be from mixed populations of cells. Additionally, 
the mixed population may be from uncultivated cells. These libraries can represent the genomes of multiple 
organisms, species or subspecies. 

In one aspect, the libraries are screened via hybridization methods, such as fluorescent in situ hybridization 
(FISH), "biopanning", or by activity-based screening methods. High throughput screening can be 
performed by utilizing single cell screening systems, such as fluorescence activated cell sorting (FACS) or 
by capillary array-based systems. 

[000123] In one aspect, the present invention provides a method for identifying a biomolecule of interest 
from a population of cells, comprising: (a) encapsulating at least one cell from a population in a 
microenvironment; (b) placing the microenvironment comprising at least one cell in a growth column; (c) 
incubating the microenvironment in the growth column under conditions allowing the encapsulated cell to 
survive and be maintained, thereby isolating and maintaining the cell; and (d) isolating said cell. 

[000124] In one aspect of the present invention, the incubating step further comprises sufficient time for 
said encapsulated cells to proliferate. In another aspect, the method of the present invention further 
comprises identifying a biomolecule of interest from said microenvironment comprising at least one cell. In 
another aspect, the method further comprises isolating a biomolecule of interest from said 
microenvironment after isolating the cell. 

[000125] The invention provides novel bioactive molecules other than enzymes. In one aspect, antibiotics, 
antivirals, antitumor agents and regulatory proteins, among others are discovered utilizing the methods of 
the present invention. 

[000126] The present invention provides methods and compositions to access this untapped biodiversity 
and to rapidly screen for polynucleotides, proteins, small molecules, and other biomolecules of interest 
utilizing high throughput screening of multiple samples. These biomolecules can be derived from cultured 
or uncultured samples of organisms. In one aspect, the methods of the present invention provide a method 
for high throughput cultivation of unculturable microorganisms. 

[000127] In one aspect, the present invention provides the step of visualizing the morphology of said cell. 
Said visualizing step is done by providing a support shaped and designed for use with a laser capture 
microdissection (LCM) system. In one aspect, said visualizing is done with a laser capture microdissection 
device, for example, and without limitation, a PALM™ LCM (P.A.L.M. Mikrolaser Technologie) laser capture 
microdissection device or another LCM device known in the art. In one aspect, the invention provides a 
method further comprising the step of selecting said cell. In another aspect, the method provides selecting 
by laser catapult. 

[000128] In one aspect, the invention provides a method for identifying a biomolecule of interest from a 
population of cells, comprising: (a) encapsulating at least one cell from the population in a 
microenvironment; (b) placing the microenvironment comprising said at least one cell in a growth column; 
(c) incubating the microenvironment comprising said at least one cell in the growth column under 
conditions allowing the encapsulated cell to survive and be maintained, thereby isolating and maintaining 
the cell; and (d) identifying a microenvironment comprising a biomolecule of interest. 

[000129] In one aspect, the invention provides an apparatus for identifying a biomolecule of interest from a 
population of cells comprising: a) a container having at least two ports and a collection device for 
collection of waste products. 

[000130] In one aspect, the present invention provides methods and systems to study molecules which 
affect the interaction of ligands with receptors, e.g., G proteins with receptors, and the like. 

[000131] In one aspect, the present invention provides methods and systems for identifying clones having a 
specified activity of interest, which process comprises (i) generating one or more gene libraries derived 
from nucleic acid isolated from a mixed population of organisms; and (ii) screening said libraries utilizing a 
high throughput cell analyzer, e. g. , a fluorescence activated cell sorter or a non-optical cell sorter, to 
identify said clones. 

[000132] The invention provides methods and systems for identifying clones having a specified activity of 
interest by (i) generating one or more libraries, e.g., expression libraries, made to contain nucleic acid 
directly or indirectly isolated from a mixed population of organisms; (ii) exposing said libraries to a 
particular substrate or substrates of interest; and (iii) screening said exposed libraries utilizing a high 
throughput cell analyzer, e. g. , a fluorescence activated cell sorter or a non-optical cell sorter, to identify 
clones which react with the substrate or substrates. 

[000133] In another aspect, the invention also provides methods and systems for identifying clones having 
a specified activity of interest by (i) generating one or more gene libraries derived from nucleic acid 
directly or indirectly isolated from a mixed population of organisms; and (ii) screening said exposed 
libraries utilizing an assay requiring a binding event or the covalent modification of a target, and a high 
throughput cell analyzer, e. g. , a fluorescence activated cell sorter or non-optical cell sorter, to identify 
positive clones. 

[000134] The invention further provides a method of screening for an agent that modulates the activity of a 
target protein or other cell component (e. g. , nucleic acid), wherein the target and a selectable marker are 
expressed by a recombinant cell, by co-encapsulating the agent in a microenvironment with the 
recombinant cell expressing the target and detectable marker and detecting the effect of the agent on the 
activity of the target cell component. 

[000135] In another aspect, the invention provides methods and systems for enriching for target DNA 
sequences containing at least a partial coding region for at least one specified activity in a DNA sample by 
co-encapsulating a mixture of target DNA obtained from a mixture of organisms with a mixture of DNA 
probes including a detectable marker and at least a portion of a DNA sequence encoding at least one 
enzyme having a specified enzyme activity and a detectable marker; incubating the co-encapsulated 
mixture under such conditions and for such time as to allow hybridization of complementary sequences and 
screening for the target DNA. Optionally the method further comprises transforming host cells with 
recovered target DNA to produce an expression library of a plurality of clones. 

[000136] The invention further provides methods and systems of screening for an agent that modulates the 
interaction of a first test protein linked to a DNA binding moiety and a second test protein linked to a 
transcriptional activation moiety by co-encapsulating the agent with the first test protein and second test 
protein in a suitable microenvironment and determining the ability of the agent to modulate the interaction 
of the first test protein linked to a DNA binding moiety with the second test protein covalently linked to a 
transcriptional activation moiety, wherein the agent enhances or inhibits the expression of a detectable 
protein. 

[000137] In yet another aspect, the present invention provides methods and systems for identifying a 
polynucleotide in a liquid phase, including contacting a plurality of polynucleotides derived from at least 
one organism, e.g., a mixed population of organisms, including microorganisms or plant tissue, with at 
least one nucleic acid probe under conditions that allow hybridization of the probe to the polynucleotides 
having complementary sequences, wherein the probe is labeled with a detectable molecule (e. g. , a 
fluorescent, magnetic or other molecule). The detectable molecule changes, e. g. , fluoresces, upon 
interaction of the probe to a target polynucleotide in the library. Clones from the library are then separated 
with an analyzer that detects the change in the detectable molecule, e.g., fluorescence, magnetic field or 
dielectric signature. The detectable molecule may also be a bioluminescent molecule, a chemiluminescent 
molecule, a colorimetric molecule, an electromagnetic molecule, an isotopic molecule, a thermal molecule 
or an enzymatic substrate. The separated clones can be contacted with a reporter system that identifies a 
polynucleotide encoding a polypeptide or a small molecule of interest, for example, and the clones capable 
of modulating expression or activity of the reporter system identified thereby identifying a polynucleotide 
of interest. The liquid phase of the aspect includes in a solution (cell-free), in a cell, or in a non-solid phase. 

[000138] In another aspect, the invention provides methods and systems for identifying a polynucleotide 
encoding a polypeptide of interest. The method includes co-encapsulating in a microenvironment a 
plurality of library clones containing DNA obtained from a mixed population of organisms with a mixture 
of oligonucleotide probes comprising a detectable marker and at least a portion of a polynucleotide 
sequence encoding a polypeptide of interest having a specified bioactivity. The encapsulated clones are 
incubated under such conditions and for such time as to allow interaction of complementary sequences, and 
clones containing a complement to the oligonucleotide probe encoding the polypeptide of interest are identified 
by separating clones with a fluorescent analyzer or non-optical analyzer that detects the detectable marker. 

[000139] In yet another aspect, the invention provides a method for high throughput screening of a polynucleotide 
library for a polynucleotide of interest that encodes a molecule of interest. 

The method includes contacting a library containing a plurality of clones comprising polynucleotides 
derived from a mixed population of organisms with a plurality of oligonucleotide probes labeled with a 
detectable molecule wherein said detectable molecule becomes detectable upon interaction of the probe to a 
target polynucleotide in the library; separating clones with an analyzer that detects the detectable marker; 
contacting the separated clones with a reporter system that identifies a polynucleotide encoding the 
molecule of interest; and identifying clones capable of modulating expression or activity of the reporter 
system thereby identifying a polynucleotide of interest. 

[000140] In another aspect, the invention provides methods and systems of screening for a polynucleotide 
encoding an activity of interest. The method includes (a) obtaining polynucleotides from a sample 
containing a mixed population of organisms; (b) normalizing the polynucleotides obtained from the 
sample; (c) generating a library from the normalized polynucleotides; (d) contacting the library with a 
plurality of oligonucleotide probes comprising a detectable marker and at least a portion of a 
polynucleotide sequence encoding a polypeptide of interest having a specified activity to select library 
clones positive for a sequence of interest; (e) selecting clones with an analyzer (e. g. a fluorescent or non- 
optical analyzer) that detects the marker; (f) contacting the selected clones with a reporter system that 
identifies a polynucleotide encoding the activity of interest; and (g) identifying clones capable of 
modulating expression or activity of the reporter system thereby identifying a polynucleotide of interest, 
wherein the positive clones contain a polynucleotide sequence encoding an activity of interest which is 
capable of catalyzing the bioactive substrate. 
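
Steps (e) through (g) amount to a two-stage filter: clones are first selected on the probe marker, and the selected clones are then checked against the reporter system. The sketch below illustrates that logic only. The `Clone` record, its fields, and both thresholds are hypothetical, and the wet-lab steps (a) through (d) are assumed to have already produced the measurements.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Clone:
    clone_id: str
    probe_signal: float       # marker intensity read by the analyzer (arbitrary units)
    reporter_response: float  # signed change in the reporter readout (arbitrary units)


def two_stage_screen(
    clones: Iterable[Clone], probe_threshold: float, reporter_threshold: float
) -> List[str]:
    """Return IDs of clones that pass the probe gate and then modulate the reporter."""
    # Step (e): keep clones whose probe marker is detected by the analyzer.
    selected = [c for c in clones if c.probe_signal >= probe_threshold]
    # Steps (f)-(g): keep selected clones that raise or lower the reporter
    # readout by at least the threshold; abs() covers both directions.
    return [
        c.clone_id
        for c in selected
        if abs(c.reporter_response) >= reporter_threshold
    ]
```

Running the probe gate first keeps the more expensive reporter assay confined to the small fraction of clones already positive for the sequence of interest.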

[000141] In yet another aspect, the present invention provides methods and systems for screening 
polynucleotides, comprising contacting a library of polynucleotides derived from a mixed population of 
organisms with a probe oligonucleotide labeled with a detectable molecule, which is detectable upon 
binding of the probe to a target polynucleotide of the library, to select library polynucleotides positive for a 
sequence of interest; separating library members that are positive for the sequence of interest with an 
analyzer that detects the molecule; expressing the selected polynucleotides to obtain polypeptides; 
contacting the polypeptides with a reporter system; and identifying polynucleotides encoding polypeptides 
capable of modulating expression or activity of the reporter system. 

[000142] In another aspect, the invention provides methods and systems for obtaining an organism from a 
mixed population of organisms in a sample. The method includes encapsulating in a microenvironment at 
least one organism from the sample; incubating the encapsulated organism under such conditions and for 
such a time to allow the at least one microorganism to grow or proliferate; and sorting the encapsulated 
organism by flow cytometry to obtain an organism from the sample. 

[000143] In another aspect, the invention provides methods and systems for identifying a polynucleotide in 
a liquid phase comprising: a) contacting a plurality of polynucleotides derived from at least one organism 
with at least one nucleic acid probe under conditions that allow hybridization of the probe to the 
polynucleotides having complementary sequences, wherein the probe is labeled with a detectable molecule; 
and b) identifying a polynucleotide of interest with an analyzer that detects the detectable molecule. 

[000144] In one aspect, the methods and systems use a sample screening apparatus including a plurality of 
capillaries formed into an array of adjacent capillaries, wherein each capillary comprises at least one wall 
defining a lumen for retaining a sample. The apparatus further includes interstitial material disposed 
between adjacent capillaries in the array, and one or more reference indicia formed within the interstitial 
material. 

[000145] In one aspect, the methods and systems use a capillary for screening a sample, wherein the 
capillary is adapted for being bound in an array of capillaries, includes a first wall defining a lumen for 
retaining the sample, and a second wall formed of a filtering material, for filtering excitation energy 
provided to the lumen to excite the sample. 

[000146] According to yet another aspect of the invention, methods and systems for incubating a 
bioactivity or biomolecule of interest include the steps of introducing a first component into at least a 
portion of a capillary of a capillary array, wherein each capillary of the capillary array comprises at least 
one wall defining a lumen for retaining the first component, and introducing an air bubble into the capillary 
behind the first component. The method further includes the step of introducing a second component into 
the capillary, wherein the second component is separated from the first component by the air bubble. 

[000147] In one aspect, the invention provides methods and systems of incubating a sample of interest that 
include introducing a first liquid labeled with a detectable particle into a capillary of a capillary array, 
wherein each capillary of the capillary array comprises at least one wall defining a lumen for retaining the 
first liquid and the detectable particle, and wherein the at least one wall is coated with a binding material 
for binding the detectable particle to the at least one wall. The method further includes removing the first 
liquid from the capillary tube, wherein the bound detectable particle is maintained within the capillary, and 
introducing a second liquid into the capillary tube. 

[000148] Another aspect of the invention includes a recovery apparatus for a sample screening system, 
wherein the system includes a plurality of capillaries formed into an array. 

The recovery apparatus includes a recovery tool adapted to contact at least one capillary of the capillary 
array and recover a sample from the at least one capillary. The recovery apparatus further includes an 
ejector, connected with the recovery tool, for ejecting the recovered sample from the recovery tool. 

[000149] The invention provides universal and novel methods and systems that provide access to this 
immense reservoir of untapped microbial diversity. This technique combines compartmentalized 
microcolonies with flow cytometry for massively parallel microbial cultivation. The invention provides the 
ability to grow and study these organisms in pure culture. It revolutionizes our understanding of microbial 
physiology and metabolic adaptation and provides new sources of novel microbial metabolites. The 
invention can be applied to samples from several different environments, including seawater, sediments, 
and soil. 

[000150] The details of one or more embodiments of the invention are set forth in the accompanying 
drawings and the description below. Other features, objects, and advantages of the invention will be 
apparent from the description and drawings, and from the claims. 

[000151] All publications mentioned herein are incorporated herein by reference in full for the purpose of 
describing and disclosing the databases, proteins, and methodologies, which are described in the 
publications which might be used in connection with the presently described invention. The publications 
discussed above and throughout the text are provided solely for their disclosure prior to the filing date of 
the present application. Nothing herein is to be construed as an admission that the inventors are not entitled 
to antedate such disclosure by virtue of prior invention. 

[000152] All publications, patents, patent applications, GenBank sequences and ATCC deposits, cited 
herein are hereby expressly incorporated by reference for all purposes. 

BRIEF DESCRIPTION OF THE FIGURES The following drawings are illustrative of embodiments of the 
invention and are not meant to limit the scope of the invention as encompassed by the claims. 

Figure 1 illustrates the protocol used in the cell sorting method of the invention to screen for a 
polynucleotide of interest, in this case using a (library excised into E. coli). The clones of interest are 
isolated by sorting. 

Figure 2 shows a microtiter plate where clones or cells are sorted in accordance with the invention. 
Typically one cell or cells grown within a microdroplet are dispersed per well and grown up as clones. 

Figure 3 depicts a co-encapsulation assay. Cells containing library clones are co-encapsulated with a 
substrate or labeled oligonucleotide. Encapsulation can occur in a variety of means, including GMDs, 
liposomes, and ghost cells. Cells are screened via high throughput screening on a fluorescence analyzer. 

Figure 4 depicts a side scatter versus forward scatter graph of FACS sorted gel-microdroplets (GMDs) 
containing a species of Streptomyces which forms unicells. Empty gel-microdroplets are also distinguished 
from free cells and debris. 

Figure 5 is a depiction of a FACS/Biopanning method described herein and described in Example 3, below. 

Figure 6A shows an example of dimensions of a capillary array of the invention. 

Figure 6B illustrates an array of capillary arrays. 

Figure 7 shows a top cross-sectional view of a capillary array. 

Figure 8 is a schematic depicting the excitation of and emission from a sample within the capillary lumen 
according to one aspect of the invention. 

Figure 9 is a schematic depicting the filtering of excitation and emission light to and from a sample within 
the capillary lumen according to an alternative aspect of the invention. 

Figure 10 illustrates an aspect of the invention in which a capillary array is wicked by contacting a sample 
containing cells, and humidified in a humidified incubator followed by imaging and recovery of cells in the 
capillary array. 

Figure 11 illustrates a method for incubating a sample in a capillary tube by an evaporative and capillary 
wicking cycle. 

Figure 12A shows a portion of a surface of a capillary array on which condensation has formed. Figure 12B 
shows the portion of the surface of the capillary array, depicted in Figure 12A, in which the surface is 
coated with a hydrophobic layer to inhibit condensation near an end of individual capillaries. 

Figures 13A, 13B and 13C depict a method of retaining at least two components within a capillary. 

Figure 14A depicts capillary tubes containing paramagnetic beads and cells. Figure 14B depicts the use of 
the paramagnetic beads to stir a sample in a capillary tube. 

Figure 15 depicts an excitation apparatus for a detection system according to an aspect of the invention. 

Figure 16 illustrates a system for screening samples using a capillary array according to an aspect of the 
invention. 

Figure 17A illustrates one example of a recovery technique useful for recovering a sample from a capillary 
array. In this depiction a needle is contacted with a capillary containing a sample to be obtained. A vacuum 
is created to evacuate the sample from the capillary tube and onto a filter. Figure 17B illustrates one sample 
recovery method in which the recovery device has an outer diameter greater than the inner diameter of the 
capillary from which a sample is being recovered. Figure 17C illustrates another sample recovery method 
in which the recovery device has an outer diameter approximately equal to or less than the inner diameter 
of the capillary. Figure 17D shows the further processing of the sample once evacuated from the capillary. 

Figure 18 is a schematic showing high throughput enrichment of low copy gene targets. 

Figure 19 is a schematic of FACS-Biopanning using high throughput culturing. 

Polyketide synthase sequences from environmental samples are shown in the alignment. 

Figure 20 shows whole cell hybridization for biopanning. 

Figure 21 is a schematic showing co-encapsulation of a eukaryotic cell and a bacterial cell. 
Figure 22 illustrates a whole cell hybridization schematic for biopanning and FACS sorting. 
Figure 23 shows a schematic of T7 RNA Polymerase Expression system. 

Figure 24 is a schematic summarizing an exemplary protocol to determine the optimal growth medium for 
a broad diversity of organisms, as described in detail in Example 18, below. 

Figure 25 is an illustration of a light scattering signature of microcolonies as detected and separated by 
flow cytometry, as described in detail in Example 18, below. 

Figures 26a, 26b and 26c are schematic drawings summarizing the characterization of clones 
(microcolonies) from organisms found and isolated by a method of the invention and analyzed by 16S 
rRNA gene sequence analysis, as described in detail in Example 18, below. Figure 26d is an illustration of 
a picture of a culture designated as strain GM1DJE10E6, as described in detail in Example 18, below. 

Like reference symbols in the various drawings indicate like elements. 

DETAILED DESCRIPTION OF THE INVENTION [000153] The invention provides novel high
throughput cultivation methods and systems based on the combination of single cell encapsulation
procedures with flow cytometry that enable cells to proliferate, grow or be maintained with nutrients that
are present at environmental concentrations.

[000154] The present invention provides methods and systems for rapid sorting, isolation, detection and
screening of libraries derived from a mixed population of organisms from, for example, an environmental
sample or an uncultivated population of organisms. In one aspect, gene libraries are generated, clones are
either exposed to a substrate or substrates of interest or hybridized to a fluorescently labeled probe
having a sequence corresponding to a sequence of interest, and positive clones are identified and isolated
via fluorescence activated cell sorting (FACS). Cells can be viable or non-viable during the process or at
the end of the process, as nucleic acids encoding a positive activity can be isolated and cloned utilizing
techniques well known in the art.

[000155] This invention differs from fluorescence activated cell sorting, as normally performed, in several 
aspects. Previously, FACS machines have been employed in studies focused on the analyses of eukaryotic 
and prokaryotic cell lines and cell culture processes. 

FACS has also been utilized to monitor production of foreign proteins in both eukaryotes and prokaryotes 
to study, for example, differential gene expression. The detection and counting capabilities of the FACS 
system have been applied in these examples. However, FACS has never previously been employed in a 
discovery process to screen for and recover bioactivities in prokaryotes. In addition, non-optical methods 
have not been used to identify or discover novel bioactivities or biomolecules. Furthermore, the present 
invention does not require cells to survive, as do previously described technologies, since the desired 
nucleic acid (recombinant clones) can be obtained from living or dead cells. For example, the cells only
need to be viable long enough to contain, carry or synthesize a complementary nucleic acid sequence to be 
detected, and can thereafter be either viable or non-viable cells so long as the complementary sequence 
remains intact. The present invention also solves problems that would have been associated with detection 
and sorting of E. coli expressing recombinant enzymes, and recovering encoding nucleic acids. The 
invention includes within its aspects apparatus capable of detecting a molecule or marker that is indicative 
of a bioactivity or biomolecule of interest, including optical and non-optical apparatus. 

[000156] In one aspect, the present invention includes any apparatus capable of detecting
fluorescent wavelengths associated with biological material; such apparatuses are defined herein as
fluorescent analyzers (one example of which is a FACS apparatus).

[000157] In the methods and systems of the invention, use of a culture-independent approach to directly
clone genes encoding novel enzymes from, for example, an environmental sample containing a mixed
population of organisms allows one to access untapped resources of biodiversity. In one aspect, the
invention is based on the construction of mixed population libraries which represent the collective
genomes of naturally occurring organisms archived in cloning vectors that can be propagated in suitable
prokaryotic hosts.

Because the cloned DNA is initially extracted directly from environmental samples, the libraries are not 
limited to the small fraction of prokaryotes that can be grown in pure culture. 

Additionally, a normalization of the DNA present in these samples could allow more equal representation 
of the DNA from all of the species present in the original sample. This can increase the efficiency of 
finding interesting genes from minor constituents of the sample which may be under-represented by several 
orders of magnitude compared to the dominant species. 
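
The effect of normalization on screening effort can be illustrated with a small calculation. The following is a minimal sketch and not part of the disclosed method: it assumes clones are drawn independently with probability proportional to each species' share of the DNA, treats normalization as ideal (every species equally represented), and uses hypothetical species labels and abundances.

```python
import math


def clone_fractions(abundances, normalized=False):
    """Expected fraction of library clones contributed by each species.

    abundances: dict mapping a species label to its relative DNA abundance.
    normalized: if True, assume ideal normalization (equal representation).
    """
    if normalized:
        weights = {name: 1.0 for name in abundances}
    else:
        weights = dict(abundances)
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def clones_needed(fraction, confidence=0.99):
    """Clones to sample so that, with the given confidence, at least one
    clone comes from a species contributing `fraction` of the library."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    if fraction == 1.0:
        return 1
    # P(no hit in n draws) = (1 - fraction)**n must not exceed 1 - confidence.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - fraction))


# Hypothetical sample: one dominant species and one minor constituent
# under-represented by three orders of magnitude.
sample = {"dominant": 1000.0, "minor": 1.0}

raw = clone_fractions(sample)
norm = clone_fractions(sample, normalized=True)

print("clones needed before normalization:", clones_needed(raw["minor"]))
print("clones needed after normalization: ", clones_needed(norm["minor"]))
```

The sketch counts only the chance of sampling any clone from the minor species; covering a particular gene within that species' genome would require correspondingly more clones.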

[000158] Prior to the present invention, the evaluation of complex mixed population expression libraries 
was rate limiting. The present invention allows the rapid screening of complex mixed population libraries, 
containing, for example, genes from thousands of different organisms. The benefits of the present invention 
can be seen, for example, in screening a complex mixed population sample. Screening of a complex sample 
previously required one to use labor intensive methods to screen several million clones to cover the 
genomic biodiversity. The invention represents an extremely high-throughput screening method which 
allows one to assess this enormous number of clones. The method disclosed herein allows the screening of
anywhere from about 30 million to about 200 million clones per hour for a desired nucleic acid sequence or
biological activity. This allows the thorough screening of mixed population libraries for clones expressing
novel biomolecules.
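
To put the stated throughput in perspective, the arithmetic can be written out. This is an illustrative sketch only: the two rates are the bounds quoted above, while the library size is a hypothetical value chosen to match the phrase "several million clones", and the calculation ignores sample preparation and recovery time.

```python
LOW_RATE = 30_000_000    # clones per hour, lower bound stated in the text
HIGH_RATE = 200_000_000  # clones per hour, upper bound stated in the text


def screening_minutes(n_clones, clones_per_hour):
    """Minutes of sorting time needed to pass n_clones at a given rate."""
    if n_clones < 0:
        raise ValueError("n_clones must be non-negative")
    if clones_per_hour <= 0:
        raise ValueError("clones_per_hour must be positive")
    return 60.0 * n_clones / clones_per_hour


def screening_window(n_clones):
    """Return (fastest, slowest) sorting time in minutes for the stated range."""
    return (
        screening_minutes(n_clones, HIGH_RATE),
        screening_minutes(n_clones, LOW_RATE),
    )


library_size = 5_000_000  # hypothetical library of several million clones
fastest, slowest = screening_window(library_size)
print(f"{library_size:,} clones: {fastest:.1f} to {slowest:.1f} minutes")
```

Under these assumptions, a library of 5 million clones corresponds to between 1.5 and 10 minutes of sorting time.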

[000159] The invention provides methods and compositions whereby one can screen, sort or identify a
polynucleotide sequence, polypeptide, or molecule of interest from a mixed population of organisms (e.g.,
organisms present in a mixed population sample) based on polynucleotide sequences present in the sample.
Thus, the invention provides methods and compositions useful in screening organisms for a desired 
biological activity or biological sequence and to assist in obtaining sequences of interest that can further be 
used in directed evolution, molecular biology, biotechnology and industrial applications. By screening and 
identifying the nucleic acid sequences present in the sample, the invention increases the repertoire of 
available sequences that can be used for the development of diagnostics, therapeutics or molecules for 
industrial applications. Accordingly, the methods of the invention can identify novel nucleic acid sequences 
encoding proteins or polypeptides having a desired biological activity. 

[000160] In one aspect, the invention provides methods and systems for high throughput culturing of 
organisms. In one aspect, the organisms are a mixed population of organisms. 

In another aspect, the organisms include host cells of a library containing nucleic acids. For example, such 
libraries include nucleic acid obtained from various isolates of organisms, which are then pooled; nucleic 
acid obtained from isolate libraries, which are then pooled; or nucleic acids derived directly from a mixed 
population of organisms. Generally, a sample containing the organisms is mixed with a composition that 
can form a microenvironment, as described herein, e.g., a gel microdroplet or a liposome. In one aspect, as
illustrated in Example 8, a mixed population of microorganisms is mixed with the encapsulation material in
such a way that preferably fewer than 5 microorganisms are encapsulated. Preferably, only one
microorganism is encapsulated in each microenvironment system.
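
The dilution needed to reach mostly single-cell occupancy can be estimated with a simple loading model. The sketch below is an assumption for illustration, not a method stated in this description: it supposes that cells distribute randomly among microenvironments so that the number of cells per microenvironment follows a Poisson distribution, and the mean loadings shown are hypothetical.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k cells when the average loading is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def occupancy(mean_cells):
    """Summarize occupancy for a given average number of cells per
    microenvironment, under the assumed Poisson loading model."""
    if mean_cells <= 0:
        raise ValueError("mean_cells must be positive")
    p_empty = poisson_pmf(0, mean_cells)
    p_single = poisson_pmf(1, mean_cells)
    # "Fewer than 5" covers 0 through 4 cells per microenvironment.
    p_under_five = sum(poisson_pmf(k, mean_cells) for k in range(5))
    p_occupied = 1.0 - p_empty
    return {
        "empty": p_empty,
        "single": p_single,
        "fewer_than_5": p_under_five,
        "single_given_occupied": p_single / p_occupied,
    }


for mean in (0.1, 0.5, 1.0):  # hypothetical average loadings
    stats = occupancy(mean)
    print(
        f"mean={mean}: empty={stats['empty']:.3f}, "
        f"single={stats['single']:.3f}, "
        f"single among occupied={stats['single_given_occupied']:.3f}"
    )
```

In this model, lowering the average loading raises the share of occupied microenvironments that hold exactly one cell, at the cost of more empty microenvironments that must later be sorted away.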

[000161] Once encapsulated, the cells are cultured in a manner which allows growth of the organisms, e.g.,
host cells of a library. For example, Example 8 provides growth of the encapsulated organisms in a
chromatography column which allows a flow of growth medium providing nutrients for growth and for
removal of waste products from cells. Over a period of time (20 minutes to several weeks or months), a
clonal population derived from the (preferably single) encapsulated organism grows within the
microenvironment.

[000162] After a desired period of time, microenvironments, such as gel microdroplets, can be sorted to
eliminate "empty" microenvironments and to sort for the occupied microenvironments. The nucleic acid
from organisms in the sorted microenvironments can be studied directly, for example, by treating with a
PCR mixture and amplifying immediately after sorting. In one Example described herein, 16S rRNA genes
from individual cells were studied and organisms assessed for phylogenetic diversity from the samples.
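
The separation of occupied from empty microenvironments can be pictured as a gating step on per-event measurements. The following is a hedged sketch rather than the instrument's actual logic: it assumes that microenvironments containing a microcolony scatter more light than empty ones, and the `Event` fields, threshold values and arbitrary units are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One microenvironment passing the detector (hypothetical fields)."""
    event_id: int
    forward_scatter: float  # arbitrary units
    side_scatter: float     # arbitrary units


def gate_occupied(events, fsc_min, ssc_min):
    """Split events into (occupied, empty) using two scatter thresholds.

    An event is called occupied only if both scatter values reach their
    thresholds; everything else is treated as empty.
    """
    occupied, empty = [], []
    for event in events:
        if event.forward_scatter >= fsc_min and event.side_scatter >= ssc_min:
            occupied.append(event)
        else:
            empty.append(event)
    return occupied, empty


# Hypothetical readings for four microenvironments.
events = [
    Event(1, forward_scatter=120.0, side_scatter=15.0),
    Event(2, forward_scatter=480.0, side_scatter=210.0),
    Event(3, forward_scatter=510.0, side_scatter=40.0),
    Event(4, forward_scatter=650.0, side_scatter=305.0),
]

occupied, empty = gate_occupied(events, fsc_min=400.0, ssc_min=150.0)
print("occupied:", [e.event_id for e in occupied])
print("empty:   ", [e.event_id for e in empty])
```

In practice the thresholds would be set from control populations rather than chosen in advance.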

[000163] In another aspect, the high throughput culturing methods of the invention allow culturing of
organisms and enrichment of low copy gene targets. For example, a library of nucleic acid obtained from
various isolates of organisms, which are then pooled; nucleic acid obtained from isolate libraries, which are
then pooled; or nucleic acids derived directly from a mixed population of organisms, for example, are
encapsulated, such as in a gel microdroplet or other suitable microenvironment, and grown under
conditions which allow clonal expansion of each organism in the microenvironment. In one aspect, the
cells of the clonal population are lysed and treated with proteinases to yield nucleic acid (see Figures) (e.g.,
the microcolonies are de-proteinized by incubating gel microdroplets in lysis solution containing
proteinase K at 37 degrees C for 30 minutes). The nucleic acid entrapped in the microenvironments is then
denatured with alkaline denaturing solution (0.5 M NaOH) and neutralized
(e.g., with Tris pH 8). In one particular example, nucleic acid entrapped in the microenvironment is
hybridized with digoxigenin (DIG)-labeled oligonucleotides (30-50 nt) in DIG Easy Hyb (available from
Roche) overnight at 37 degrees C, followed by washing with 0.3x SSC and 0.1x SSC at 38-50 degrees C to
achieve desired stringency. One of skill in the art will appreciate that this is merely an example and not
meant to limit the invention in any way. For example, other labels commonly used in the art, e.g.,
fluorescent labels such as GFP or chemiluminescent labels, can be utilized in the invention methods. In one
aspect, gel microdroplet technology is used to practice the methods and systems of the invention. It can be
used to amplify the signals available in flow cytometric analysis and to permit the screening of
microbial strains in strain improvement programs for biotechnology. See, e.g., Wittrup et al.
(Biotechnol. Bioeng. (1993) 42: 351-356), describing a micro-encapsulation selection method which allows
the rapid and quantitative screening of cells (e.g., in that example, >10^6 yeast cells for enhanced secretion of
Aspergillus awamori glucoamylase). This technique can provide a 400-fold single-pass enrichment for
high-secretion mutants.

[000164] Gel microdroplet or other related technologies can be used in the present invention to localize as
well as amplify signals in the high throughput screening of recombinant libraries. Cell viability during the
screening is not an issue or concern since nucleic acid can be recovered from the microdroplet.

[000165] In one aspect, the nucleic acid is hybridized with a probe which can be labeled. A signal can be
amplified with a secondary label (e.g., fluorescent) and the nucleic acid sorted for fluorescent
microenvironments, e.g., gel microdroplets. Nucleic acid that is fluorescent can be isolated and further
studied or cloned into a host cell for further manipulation. In one particular example, signals are amplified
with the Tyramide Signal Amplification (TSA) kit from Molecular Probes. TSA is an enzyme-mediated signal
amplification method that utilizes horseradish peroxidase (HRP) to deposit fluorogenic tyramide molecules
and generate high-density labeling of a target nucleic acid sequence in situ. The signal amplification is
conferred by the turnover of multiple tyramide substrates per HRP molecule, and increases in signal
strength of over 1,000-fold have been reported. The procedure involves incubating gel microdroplets (GMDs)
with anti-DIG conjugated horseradish peroxidase (anti-DIG-HRP) (Roche, IN) for 3 hours at room temperature.
The tyramide substrate solution is then added and incubated for 30 minutes at room temperature (RT).

[000166] In one embodiment, fluorescence in situ hybridization (FISH) is used for the detection of single 
microorganisms out of a mixed population of cells without disintegrating the cell structure. Fluorescently 
labeled or biotinylated oligonucleotides are stringently hybridized to their respective specific binding site 
(ribosomal or messenger RNA) and detected by epifluorescence microscopy (allows visualization of target 
organisms), fluorescence activated cell sorting (FACS), or other techniques known in the art suitable for 
detecting labeled biomolecules or any combination thereof. This technique may be applied prior to 
isolating an identified target molecule. 

[000167] In one aspect, this high throughput culturing method, followed by sorting (e.g., FACS) and screening
(e.g., biopanning), allows for identification of gene targets. It may be desirable to screen for nucleic acids
encoding virtually any protein or any bioactivity and to compare such nucleic acids among various species
of organisms in a sample (e.g., study polyketide sequences from a mixed population). In another aspect,
nucleic acid derived from high throughput culturing of organisms can be obtained for further study or for
generation of a library. Such nucleic acid can be pooled and a library created, or alternatively, individual
libraries from clonal populations of organisms can be generated and then nucleic acid pooled from those
libraries to generate a more complex library. The libraries generated as described herein can be utilized for
the discovery of biomolecules (e.g., nucleic acid or bioactivities) or for evolving nucleic acid molecules
identified by the high throughput culturing methods described in the present invention.

[000168] The methods and systems of the invention can use evolution methods known in the art or
described herein, such as shuffling, cassette mutagenesis, recursive ensemble mutagenesis, sexual PCR,
directed evolution, exonuclease-mediated reassembly, codon site-saturation mutagenesis, amino acid
site-saturation mutagenesis, gene site saturation mutagenesis, introduction of mutations by non-stochastic
polynucleotide reassembly methods, synthetic ligation polynucleotide reassembly, gene reassembly,
oligonucleotide-directed saturation mutagenesis, in vivo reassortment of polynucleotide sequences having
partial homology, naturally occurring recombination processes which reduce sequence complexity, and any
combination thereof.

[000169] Flow cytometry has been used in cloning and selection of variants from existing cell clones. This 
selection, however, has required stains that diffuse through cells passively, rapidly and irreversibly, with no 
toxic effects or other influences on metabolic or physiological processes. Since, typically, flow sorting has 
been used to study animal cell culture performance, physiological state of cells, and the cell cycle, one goal 
of cell sorting has been to keep the cells viable during and after sorting. 

[000170] There currently are no reports in the literature of screening and discovery of polynucleotide
sequences in libraries by cell sorting based on fluorescence (e.g., fluorescence activated cell sorting) or
non-optical markers (e.g., magnetic fields and the like).

Furthermore, there are no reports of recovering DNA encoding bioactivities screened by FACS or
non-optical techniques and additionally screening for a bioactivity of interest. The present invention provides
these methods to allow the extremely rapid screening of viable or non-viable cells to recover desirable
activities and the nucleic acid encoding those activities.

[000171] Different types of encapsulation (e.g., gel microdroplet) strategies and compounds or polymers
can be used with the present invention. One skilled in the art would understand that the type of
encapsulation strategy used will depend on the cell population to be encapsulated. For instance, high
temperature agaroses can be employed for making microdroplets stable at high temperatures, allowing
stable encapsulation of cells subsequent to heat-kill steps utilized to remove all background activities when
screening for thermostable bioactivities. Encapsulation can be in beads, high temperature agaroses, gel
microdroplets, cells, such as ghost red blood cells or macrophages, liposomes, or any other means of
encapsulating and localizing molecules. For example, methods of preparing liposomes have been described
(i.e., U.S. Patent Nos. 5,653,996, 5,393,530 and 5,651,981), as well as the use of liposomes to
encapsulate a variety of molecules (U.S. Patent Nos. 5,595,756, 5,605,703, 5,627,159, 5,652,225,
5,567,433, 4,235,871 and 5,227,170). Entrapment of proteins, viruses, bacteria and DNA in erythrocytes during
endocytosis has been described as well (Journal of Applied Biochemistry 4, 418-435 (1982)). Erythrocytes
employed as carriers in vitro or in vivo for substances entrapped during hypo-osmotic lysis or dielectric
breakdown of the membrane have also been described (reviewed in Ihler, G. M. (1983) J. Pharm. Ther).

Although a variety of encapsulation techniques have been described, they serve as non-limiting
examples of techniques known in the art, and one skilled in the art would understand that the encapsulation
technique will depend on the population of cells used. These techniques are useful in the present invention
to encapsulate samples for screening.

[000172] "Microenvironment", as used herein, is any molecular structure which provides an appropriate 
environment for facilitating the interactions necessary for the method of the invention. An environment
suitable for facilitating molecular interactions includes, for example, gel microdroplets, ghost cells,
macrophages or liposomes, among others. In one aspect, a microenvironment further mimics in situ 
conditions of the environment from which the cell population originated, such as from sea water or soil, 
allowing the culturing of the cells within the microenvironment. 

[000173] In a preferred embodiment, a microenvironment is shaped and designed to allow media and/or
nutrients and amendments to flow into and out of the microenvironment.
Additionally, waste may be removed from the microenvironment by the flow of the media and/or
nutrients.

[000174] As a non-limiting example, liposomes can be prepared from a variety of lipids including
phospholipids, glycolipids, steroids, long-chain alkyl esters, e.g., alkyl phosphates; fatty acid esters, e.g.,
lecithin; fatty amines; and the like. A mixture of fatty material may be employed, such as a combination of
a neutral steroid, a charged amphiphile and a phospholipid.

Illustrative examples of phospholipids include lecithin, sphingomyelin and
dipalmitoylphosphatidylcholine. Representative steroids include cholesterol, cholestanol and lanosterol.

Representative charged amphiphilic compounds generally contain from 12-30 carbon atoms.

Examples include mono- or dialkyl phosphate esters, or alkyl amines, e.g., diacyl phosphate, stearyl amine,
hexadecyl amine, dilauryl phosphate, and the like. One skilled in the art would understand that other
environments may be prepared for use with specified cell populations.

[000175] The invention includes methods and systems for holding and screening samples.
According to one aspect of the invention, a sample screening apparatus includes a plurality of capillaries
formed into an array of adjacent capillaries, wherein each capillary comprises at least one wall defining a
lumen for retaining a sample. The apparatus further includes interstitial material disposed between adjacent
capillaries in the array, and one or more reference indicia formed within the interstitial material (see
co-pending U.S. patent applications serial nos. 09/687,219 and 09/894,956).

[000176] According to another aspect of the invention, a capillary for screening a sample, wherein the 
capillary is adapted for being bound in an array of capillaries, includes a first wall defining a lumen for
retaining the sample, and a second wall formed of a filtering material, for filtering excitation energy 
provided to the lumen to excite the sample. 

[000177] In another aspect of the invention, methods and systems for incubating a bioactivity or
biomolecule of interest comprise the steps of introducing a first component into at least a portion of a
capillary of a capillary array, wherein each capillary of the capillary array comprises at least one wall 
defining a lumen for retaining the first component, and introducing an air bubble into the capillary behind 
the first component. The method further includes the step of introducing a second component into the 
capillary, wherein the second component is separated from the first component by the air bubble. 

[000178] In one aspect of the invention, methods and systems of incubating a sample of interest include
introducing a first liquid labeled with a detectable particle into a capillary of a capillary array, wherein each
capillary of the capillary array comprises at least one wall defining a lumen for retaining the first liquid and
the detectable particle, and wherein the at least one wall is coated with a binding material for binding the
detectable particle to the at least one wall. The method further includes removing the first liquid from the
capillary tube, wherein the bound detectable particle is maintained within the capillary, and introducing a
second liquid into the capillary tube.

[000179] Another aspect of the invention includes a recovery apparatus for a sample screening system, 
wherein the system includes a plurality of capillaries formed into an array. 

The recovery apparatus includes a recovery tool adapted to contact at least one capillary of the capillary 
array and recover a sample from the at least one capillary. The recovery apparatus further includes an
ejector, connected with the recovery tool, for ejecting the recovered sample from the recovery tool. 
Definitions [000180] Unless defined otherwise, all technical and scientific terms used herein have the same
meaning as commonly understood by one of ordinary skill in the art to which the invention belongs.
Although any methods, devices and materials similar or equivalent to those described herein can be used in 
the practice or testing of the invention, the methods, devices and materials are now described. 

[000181] As used herein and in the appended claims, the singular forms "a," "an," and "the" include plural
referents unless the context clearly dictates otherwise. Thus, for example, reference to "a clone" includes a
plurality of clones and reference to "the nucleic acid sequence" generally includes reference to one or more
nucleic acid sequences and equivalents thereof known to those skilled in the art, and so forth.

[000182] An "amino acid" is a molecule having the structure wherein a central carbon atom (the α-carbon
atom) is linked to a hydrogen atom, a carboxylic acid group (the carbon atom of which is referred to herein
as a "carboxyl carbon atom"), an amino group (the nitrogen atom of which is referred to herein as an "amino
nitrogen atom"), and a side chain group, R.

When incorporated into a peptide, polypeptide, or protein, an amino acid loses one or more atoms of its
amino and carboxylic groups in the dehydration reaction that links one amino acid to another. As a result,
when incorporated into a protein, an amino acid is referred to as an "amino acid residue."

[000183] "Biomolecule" as used herein refers to any biological molecule, or equivalent, including any
biological molecule contained in a cell or produced by a cell, or equivalent.

Examples include, but are not limited to, polypeptides, proteins, lipids, metabolites, secondary metabolites,
antibodies, gene products, carbohydrates and small molecules, amongst others.

[000184] "Protein" or "polypeptide" refers to any polymer of two or more individual amino acids (whether or
not naturally occurring) linked via a peptide bond, which occurs when the carboxyl carbon atom of the
carboxylic acid group bonded to the α-carbon of one amino acid (or amino acid residue) becomes
covalently bound to the amino nitrogen atom of the amino group bonded to the α-carbon of an adjacent amino
acid. The term "protein" is understood to include the terms "polypeptide" and "peptide" (which, at times, may
be used interchangeably herein) within its meaning. In addition, proteins comprising multiple polypeptide
subunits (e.g., DNA polymerase III, RNA polymerase II) or other components (for example, an RNA
molecule, as occurs in telomerase) will also be understood to be included within the meaning of "protein" as
used herein. Similarly, fragments of proteins and polypeptides are also within the scope of the invention
and may be referred to herein as "proteins." "Protein" or "polypeptide" also refers to any synthetic equivalent,
e.g., a peptidomimetic.

[000185] A particular amino acid sequence of a given protein (i.e., the polypeptide's "primary structure,"
when written from the amino-terminus to carboxy-terminus) is determined by the nucleotide sequence of
the coding portion of an mRNA, which is in turn specified by genetic information, typically genomic DNA
(including organelle DNA, e.g., mitochondrial or chloroplast DNA). Thus, determining the sequence of a
gene assists in predicting the primary sequence of a corresponding polypeptide and, more particularly, the
role or activity of the polypeptide or proteins encoded by that gene or polynucleotide sequence.

[000186] The term "isolated" means altered "by the hand of man" from its natural state; i.e., if it occurs in
nature, it has been changed or removed from its original environment, or both.

For example, a naturally occurring polynucleotide or a polypeptide naturally present in a living animal, a
biological sample or an environmental sample in its natural state is not "isolated", but the same
polynucleotide or polypeptide separated from the coexisting materials of its natural state is "isolated", as the
term is employed herein. Such polynucleotides, when introduced into host cells in culture or in whole
organisms, still would be isolated, as the term is used herein, because they would not be in their naturally
occurring form or environment. Similarly, the polynucleotides and polypeptides may occur in a
composition, such as a media formulation (solutions for introduction of polynucleotides or polypeptides,
for example, into cells or compositions or solutions for chemical or enzymatic reactions).

[000187] "Polynucleotide" or "nucleic acid sequence" refers to a polymeric form of nucleotides. In some
instances a polynucleotide refers to a sequence that is not immediately contiguous with either of the coding
sequences with which it is immediately contiguous (one on the 5' end and one on the 3' end) in the naturally
occurring genome of the organism from which it is derived. The term therefore includes, for example, a
recombinant DNA which is incorporated into a vector; into an autonomously replicating plasmid or virus;
or into the genomic DNA of a prokaryote or eukaryote, or which exists as a separate molecule (e.g., a
cDNA) independent of other sequences. The nucleotides of the invention can be ribonucleotides,
deoxyribonucleotides, or modified forms of either nucleotide. A polynucleotide as used herein refers to, among
others, single- and double-stranded DNA, DNA that is a mixture of single- and double-stranded regions,
single- and double-stranded RNA, RNA that is a mixture of single- and double-stranded regions, and hybrid
molecules comprising DNA and RNA that may be single-stranded or, more typically, double-stranded or a
mixture of single- and double-stranded regions. In addition, polynucleotide as used herein refers to
triple-stranded regions comprising RNA or DNA or both RNA and DNA.

The strands in such regions may be from the same molecule or from different molecules. The regions may
include all of one or more of the molecules, but more typically involve only a region of some of the
molecules. One of the molecules of a triple-helical region often is an oligonucleotide. The term
polynucleotide encompasses genomic DNA or RNA (depending upon the organism, i.e., RNA genome of
viruses), as well as mRNA encoded by the genomic DNA, and cDNA.

[000188] By rapidly screening for polynucleotides encoding polypeptides of interest, the invention
provides not only a source of materials for the development of biologics, therapeutics, and enzymes for
industrial applications, but also new materials for further processing by, for example, directed
evolution and mutagenesis to develop molecules or polypeptides modified for particular activity or 
conditions. 

[000189] The invention is used to obtain, identify, and isolate polynucleotides and related sequence 
specific information from, for example, infectious microorganisms present in the environment such as, for 
example, in the gut of various macroorganisms. 

[000190] In another aspect, the methods and compositions of the invention provide for the identification of
lead drug compounds present in an environmental sample. The methods of the invention provide the ability
to mine the environment for novel drugs or identify related drugs contained in different microorganisms.
There are several common sources of lead compounds (drug candidates), including natural product
collections, synthetic chemical collections, and synthetic combinatorial chemical libraries, such as
nucleotides, peptides, or other polymeric molecules that have been identified or developed as a result of
environmental mining. Each of these sources has advantages and disadvantages. The success of programs
to screen these candidates depends largely on the number of compounds entering the programs, and
pharmaceutical companies have to date screened hundreds of thousands of synthetic and natural compounds
in search of lead compounds. Unfortunately, the ratio of novel to previously-discovered compounds has
diminished with time. The discovery rate of novel lead compounds has not kept pace with demand despite
the best efforts of pharmaceutical companies. There exists a strong need for accessing new sources of
potential drug candidates. Accordingly, the invention provides a rapid and efficient method to identify and
characterize environmental samples that may contain novel drug compounds.

[000191] The invention provides methods of identifying a nucleic acid sequence encoding a polypeptide 
having either known or unknown function. For example, much of the diversity in microbial genomes
results from the rearrangement of gene clusters in the genome of microorganisms. These gene clusters can
be present across species or phylogenetically related with other organisms. The compositions (systems) and
methods of the present invention facilitate the rapid discovery of gene clusters in gene expression
libraries. The compositions and methods of the present invention facilitate the rapid discovery of genes, 
gene pathways and gene clusters, particularly polyketide synthase genes, polyketide synthase gene 
pathways and polyketides, from gene expression libraries. 

[000192] For example, bacteria and many eukaryotes have a coordinated mechanism for regulating genes 
whose products are involved in related processes. The genes are clustered, in structures referred to as "gene 
clusters," on a single chromosome and are transcribed together under the control of a single regulatory 
sequence, including a single promoter which initiates transcription of the entire cluster. The gene cluster, 
the promoter, and additional sequences that function in regulation altogether are referred to as 
an "operon" and can include up to 20 or more genes, usually from 2 to 6 genes. Thus, a gene cluster is a 
group of adjacent genes that are either identical or related, usually as to their function. Gene clusters are 
generally 15 kb to greater than 120 kb in length. 

[000193] Some gene families consist of identical members. Clustering is a prerequisite for maintaining 
identity between genes, although clustered genes are not necessarily identical. 

Gene clusters range from extremes where a duplication is generated to adjacent related genes to cases 
where hundreds of identical genes lie in a tandem array. Sometimes no significance is discernible in a 
repetition of a particular gene. A principal example of this is the expressed duplicate insulin genes in some 
species, whereas a single insulin gene is adequate in other mammalian species. 

[000194] Further, gene clusters undergo continual reorganization and, thus, the ability to create 
heterogeneous libraries of gene clusters from, for example, bacterial or other prokaryote sources is valuable 
in determining sources of novel proteins, particularly including enzymes such as, for example, the 
polyketide synthases that are responsible for the synthesis of polyketides having a vast array of useful 
activities. Other types of proteins that are the product (s) of gene clusters are also contemplated, including, 
for example, antibiotics, antivirals, antitumor agents and regulatory proteins, such as insulin. 

[000195] As an example, polyketide synthase enzymes fall in a gene cluster. Polyketides are molecules 
which are an extremely rich source of bioactivities, including antibiotics (such as tetracyclines and 
erythromycin), anti-cancer agents (daunomycin), immunosuppressants (FK506 and rapamycin), and 
veterinary products (monensin). Many polyketides (produced by polyketide synthases) are valuable as 
therapeutic agents. Polyketide synthases are multifunctional enzymes that catalyze the biosynthesis of a 
huge variety of carbon chains differing in length and patterns of functionality and cyclization. Polyketide 
synthase genes fall into gene clusters and at least one type (designated type I) of polyketide synthases has 
large size genes and enzymes, complicating genetic manipulation and in vitro studies of these 
genes/proteins. 

[000196] The ability to select and combine desired components from a library of polyketides and 
postpolyketide biosynthesis genes for generation of novel polyketides for study is appealing. The method(s) 
of the present invention make it possible to, and facilitate the cloning of, novel polyketide synthases, 
since one can generate gene banks with clones containing large inserts (especially when using the f-factor 
based vectors), which facilitates cloning of gene clusters. 

[000197] Other biosynthetic genes include NRPS, glycosyl transferases and p450s. For example, a gene 
cluster can be ligated into a vector containing expression regulatory sequences which can control and 
regulate the production of a detectable protein or protein-related array activity from the ligated gene 
clusters. Use of vectors which have an exceptionally large capacity for exogenous nucleic acid introduction 
are particularly appropriate for use with such gene clusters and are described by way of example herein to 
include artificial chromosome vectors, cosmids, and the f-factor (or fertility factor) of E. coli. 

For example, the f-factor of E. coli is a plasmid which effects high-frequency transfer of itself during 
conjugation and is ideal to achieve and stably propagate large nucleic acid fragments, such as gene clusters 
from samples of mixed populations of organisms. 

[000198] The nucleic acid isolated or derived from these samples (e. g. , a mixed population of 
microorganisms) can preferably be inserted into a vector or a plasmid prior to screening of the 
polynucleotides. Such vectors or plasmids are typically those containing expression regulatory sequences, 
including promoters, enhancers and the like. 

[000199] In one aspect, the invention provides novel methods and systems to clone and screen mixed 
populations of organisms present from mixed and/or uncultured populations, for example, in environmental 
samples, for polypeptides, polynucleotides or other biomolecules of interest, enzymatic activities and 
bioactivities of interest in vitro. The method(s) and systems of the invention allow the cloning and 
discovery of novel bioactive molecules in vitro, and in particular novel bioactive molecules derived from 
uncultivated or cultivated samples. Large size gene clusters, genes and gene fragments can be cloned, 
sequenced and screened using the method(s) of the invention. Unlike previous strategies, the method(s) of 
the invention allow one to clone, screen and identify polynucleotides and the polypeptides encoded by 
these polynucleotides in vitro from a wide range of mixed population samples. 

[000200] The invention allows one to screen for and identify polynucleotide sequences from complex 
mixed population samples. DNA libraries obtained from these samples can be created from cell free 
samples, so long as the sample contains nucleic acid sequences, or from samples containing cellular 
organisms or viral particles. The organisms from which the libraries may be prepared include prokaryotic 
microorganisms, such as Eubacteria and Archaebacteria, lower eukaryotic microorganisms such as fungi, 
algae and protozoa, as well as plants, plant spores and pollen. The organisms may be cultured organisms or 
uncultured organisms obtained from mixed population environmental samples, including extremophiles, 
such as thermophiles, hyperthermophiles, psychrophiles and psychrotrophs. 

[000201] Sources of nucleic acids used to construct a DNA library may be obtained from mixed population 
samples, such as, but not limited to, microbial samples obtained from Arctic and Antarctic ice, water or 
permafrost sources, materials of volcanic origin, materials from soil or plant sources, i.e., from tropical 
areas, among others, droppings from various organisms including mammals, invertebrates, as well as dead 
and decaying matter, among others. Thus, for example, nucleic acids may be recovered from either a 
cultured or non-cultured organism and used to produce an appropriate DNA library (e.g., a recombinant 
expression library) for subsequent determination of the identity of the particular polynucleotide sequence or 
screening for bioactivity. 

[000202] The following outlines a general procedure for producing libraries from both culturable and non- 
culturable organisms as well as mixed population of organisms, which libraries can be probed, sequenced 
or screened to select therefrom nucleic acid sequences having an identified, desired or predicted biological 
activity (e. g. , an enzymatic activity or a small molecule). 

[000203] As used herein, a "mixed population" sample is any sample containing organisms or 
polynucleotides or a combination thereof, which can be obtained from any number of sources (as described 
above), including but not limited to, for example, insect feces, soil, water, etc. Any source of nucleic acids 
in purified or non-purified form can be utilized as starting material. Thus, the nucleic acids may be obtained 
from any source which is contaminated by an organism or from any sample containing cells. The mixed 
population sample can be an extract from any bodily sample such as blood, urine, spinal fluid, tissue, 
vaginal swab, stool, amniotic fluid or buccal mouthwash from any mammalian organism. For non- 
mammalian (e. g. , invertebrates) organisms the sample can be a tissue sample, salivary sample, fecal 
material or material in the digestive tract of the organism. An environmental sample also includes samples 
obtained from extreme environments including, but not limited to, hot sulfur pools, volcanic vents, and 
frozen tundra. In addition, the sample can come from a variety of sources. For example, in horticulture and 
agricultural testing the sample can be a plant, fertilizer, soil, liquid or other horticultural or agricultural 
product; in food testing the sample can be fresh food or processed food (for example infant formula, 
seafood, fresh produce and packaged food) ; and in environmental testing the sample can be liquid, soil, 
sewage treatment, sludge and any other sample in the environment which is considered or suspected of 
containing an organism or polynucleotides. 

[000204] When the sample is a mixture of material (e. g. , a mixed population of organisms), including but 
not limited to, for example, blood, soil and sludge, it can be treated with an appropriate reagent which is 
effective to open the cells and expose or separate the strands of nucleic acids. Mixed populations can 
comprise pools of cultured organisms or samples. For example, samples of organisms can be cultured prior 
to analysis in order to purify a particular population and thus obtain a purer sample. Organisms, such as 
actinomycetes or myxobacteria, among others, that are known to produce bioactivities of interest can be 
enriched for via culturing. Culturing of organisms in the sample may include culturing the organisms in 
microdroplets and separating the cultured microdroplets with a cell sorter into individual wells of a multi- 
well tissue culture plate from which further processing may be performed. 

[000205] The sample may comprise nucleic acids from, for example, a diverse and mixed population of 
organisms (e. g. , microorganisms present in the gut of an insect, among others). 

Nucleic acids are isolated from the sample using any number of methods known in the art for DNA and 
RNA isolation. Such nucleic acid isolation methods are commonly performed in the art. Where the nucleic 
acid is RNA, the RNA may be reverse transcribed to DNA using primers known in the art. Where the 
DNA is genomic DNA, the DNA can be sheared manually, mechanically, or chemically. Although other 
techniques are known and may be used in the method of the present invention, one example of shearing 
DNA is the use of a 25 gauge needle. 

[000206] The nucleic acids can be cloned into a vector. Cloning techniques are known in the art or may be 
developed by one skilled in the art, without undue experimentation. Vectors used in the present invention 
include, but are not limited to: plasmids, phages, cosmids, phagemids, viruses (e. g. , retroviruses, 
parainfluenzavirus, herpesviruses, reoviruses, paramyxoviruses, and the like), artificial chromosomes, or 
selected portions thereof (e. g., coat protein, spike glycoprotein, capsid protein). For example, cosmids and 
phagemids are typically used where the specific nucleic acid sequence to be analyzed or modified is large 
because these vectors are able to stably propagate large polynucleotides. One skilled in the art would 
understand that any vector may be chosen and used in the method of the present invention and is dependent 
upon the nucleic acid to be analyzed or modified. 

[000207] The vector containing the cloned DNA sequence can then be amplified by plating (i. e. , clonal 
amplification) or transfecting a suitable host cell with the vector (e. g. , a phage on an E. coli host). 
Alternatively (or subsequently to amplification), the cloned DNA sequence may be used to prepare a 
library for screening by transforming a suitable organism. Hosts known in the art are transformed by 
artificial introduction of the vectors containing the target nucleic acid by inoculation under conditions 
conducive for such transformation. One could transform with double stranded circular or linear nucleic acid 
or there may also be instances where one would transform with single stranded circular or linear nucleic 
acid sequences. By "transform" or "transformation," as used herein, is meant a permanent or transient genetic 
change induced in a cell following incorporation of new DNA (i. e. , DNA exogenous to the cell). 

Where the cell is a mammalian cell, a permanent genetic change is generally achieved by introduction of 
the DNA into the genome of the cell. A transformed cell or host cell generally refers to a cell (e. g. , 
prokaryotic or eukaryotic) into which (or into an ancestor of which) has been introduced, by means of 
recombinant DNA techniques, a DNA molecule not normally present in the host organism. 

[000208] A particular type of vector for use in the present invention contains an f-factor origin of replication. 
The f-factor (or fertility factor) in E. coli is a plasmid which effects high frequency transfer of itself during 
conjugation and less frequent transfer of the bacterial chromosome itself. In a particular aspect cloning 
vectors referred to as "fosmids" or bacterial artificial chromosome (BAC) vectors are used. These are 
derived from E. coli f-factor which is able to stably integrate large segments of DNA. When integrated with 
DNA from a mixed uncultured population sample, this makes it possible to achieve large genomic 
fragments in the form of a stable "mixed population nucleic acid library." 

[000209] The nucleic acids 
derived from a mixed population or sample may be inserted into the vector by a variety of procedures. In 
general, the nucleic acid sequence is inserted into an appropriate restriction endonuclease site (s) by 
procedures known in the art. Such procedures and others are deemed to be within the scope of those skilled 
in the art. A typical cloning scenario may have the DNA "blunted" with an appropriate nuclease (e.g., Mung 
Bean Nuclease), methylated with, for example, EcoRI Methylase and ligated to EcoRI linkers. 

The linkers are then digested with an EcoRI Restriction Endonuclease and the DNA size fractionated (e.g., 
using a sucrose gradient). The resulting size fractionated DNA is then ligated into a suitable vector for 
sequencing, screening or expression (e. g. , a lambda vector and packaged using an in vitro lambda 
packaging extract). 

[000210] Transformation of a host cell with recombinant DNA may be carried out by conventional 
techniques as are well known to those skilled in the art. Where the host is prokaryotic, such as E. coli, 
competent cells which are capable of DNA uptake can be prepared from cells harvested after exponential 
growth phase and subsequently treated by the CaCl2 method by procedures well known in the art. 
Alternatively, MgCl2 or RbCl can be used. Transformation can also be performed after forming a 
protoplast of the host cell or by electroporation. Transformation of Pseudomonas fluorescens and yeast host 
cells can be achieved by electroporation, using techniques described herein. 

[000211] When the host is a eukaryote, methods of transfection or transformation with DNA that may be used 
include conjugation, calcium phosphate co-precipitates, conventional mechanical procedures such as 
microinjection, electroporation, insertion of a plasmid encased in liposomes, or virus vectors, as well as 
others known in the art. Eukaryotic cells can also be cotransfected with a second foreign 
DNA molecule encoding a selectable marker, such as the herpes simplex thymidine kinase gene. Another 
method is to use a eukaryotic viral vector, such as simian virus 40 (SV40) or bovine papilloma virus, to 
transiently infect or transform eukaryotic cells and express the protein. (Eukaryotic Viral Vectors, Cold 
Spring Harbor Laboratory, Gluzman ed. , 1982). The eukaryotic cell may be a yeast cell (e. g., 
Saccharomyces cerevisiae), an insect cell (e. g. , Drosophila sp. ) or may be a mammalian cell, including a 
human cell. 

[000212] Eukaryotic systems, and mammalian expression systems, allow for post-translational 
modifications of expressed mammalian proteins to occur. Eukaryotic cells which possess the cellular 
machinery for processing of the primary transcript, glycosylation, phosphorylation, and, advantageously 
secretion of the gene product should be used. Such host cell lines may include, but are not limited to, CHO, 
VERO, BHK, HeLa, COS, MDCK, Jurkat, HEK-293, and WI38. 

[000213] After the gene libraries have been generated one may perform "biopanning" of the libraries prior 
to expression screening. The "biopanning" procedure refers to a process for identifying clones having a 
specified biological activity by screening for sequence homology in the library of clones, using at least one 
probe DNA comprising at least a portion of a DNA sequence encoding a polypeptide having the specified 
biological activity ; and detecting interactions with the probe DNA to a substantially complementary 
sequence in a clone. 

Clones (either viable or non-viable) are then separated by an analyzer (e. g. , a FACS apparatus or an 
apparatus that detects non-optical markers). 

[000214] The probe DNA used to probe for the target DNA of interest contained in clones prepared from 
polynucleotides in a mixed population of organisms can be a full-length coding region sequence or a partial 
coding region sequence of DNA for a known bioactivity. 

The sequence of the probe can be generated by synthetic or recombinant means and can be based upon 
computer based sequencing programs or biological sequences present in a clone. 

The DNA library can be probed using mixtures of probes comprising at least a portion of the DNA 
sequence encoding a known bioactivity having a desired activity. These probes or probe libraries are 
preferably single-stranded. The probes that are particularly suitable are those derived from DNA encoding 
bioactivities having an activity similar or identical to the specified bioactivity which is to be screened. 

[000215] In another aspect, a nucleic acid library from a mixed population of organisms is screened for a 
sequence of interest by transfecting a host cell containing the library with at least one labeled nucleic acid 
sequence which is all or a portion of a DNA sequence encoding a bioactivity having a desirable activity and 
separating the library clones containing the desirable sequence by optical- or non-optical-based analysis. 

[000216] In another aspect, in vivo biopanning may be performed utilizing a FACS-based machine. 
Complex gene libraries are constructed with vectors which contain elements which stabilize transcribed 
RNA. For example, the inclusion of sequences which result in secondary structures such as hairpins which 
are designed to flank the transcribed regions of the RNA would serve to enhance their stability, thus 
increasing their half-life within the cell. 

The probe molecules used in the biopanning process consist of oligonucleotides labeled with reporter 
molecules that only fluoresce upon binding of the probe to a target molecule. 

Various dyes or stains well known in the art, for example those described in "Practical Flow Cytometry", 
1995 Wiley-Liss, Inc., Howard M. Shapiro, M.D., can be used to intercalate or associate with nucleic acid 
in order to "label" the oligonucleotides. These probes are introduced into the recombinant cells of the library 
using one of several transformation methods. The probe molecules interact or hybridize to the transcribed 
target mRNA or DNA resulting in DNA/RNA heteroduplex molecules or DNA/DNA duplex molecules. 
Binding of the probe to a target will yield a fluorescent signal which is detected and sorted by the FACS 
machine during the screening process. 

[000217] The probe DNA can be at least about 10 bases, or, at least 15 bases. Other size ranges for probe 
DNA are at least about 15 bases to about 100 bases, at least about 100 bases to about 500 bases, at least 
about 500 bases to about 1,000 bases, at least about 1,000 bases to about 5,000 bases and at least about 
5,000 bases to about 10,000 bases. In one aspect, an entire coding region of one part of a pathway may be 
employed as a probe. Where the probe is hybridized to the target DNA in an in vitro system, conditions for 
the hybridization in which target DNA is selectively isolated by the use of at least one DNA probe will be 
designed to provide a hybridization stringency of at least about 50% sequence identity, more particularly a 
stringency providing for a sequence identity of at least about 70%. 
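
As a rough in silico illustration of what the 50% and 70% sequence identity figures mean, the following minimal sketch compares a probe with every ungapped window of a target sequence. It is not part of the disclosed method: hybridization stringency is set experimentally, and the function names, the ungapped comparison and the example sequences are assumptions made here for illustration only.

```python
def percent_identity(probe: str, window: str) -> float:
    """Percent of matching positions between two equal-length sequences."""
    if len(probe) != len(window) or not probe:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(probe.upper(), window.upper()) if a == b)
    return 100.0 * matches / len(probe)


def best_window_identity(probe: str, target: str) -> float:
    """Highest ungapped identity of the probe against any window of the target."""
    if len(target) < len(probe):
        raise ValueError("target must be at least as long as the probe")
    n = len(probe)
    return max(
        percent_identity(probe, target[i:i + n])
        for i in range(len(target) - n + 1)
    )


def meets_identity(probe: str, target: str, min_identity: float = 50.0) -> bool:
    """True if some window of the target reaches the identity threshold."""
    return best_window_identity(probe, target) >= min_identity


# Hypothetical 10-base probe and target differing at the last two positions.
print(percent_identity("ACGTACGTAC", "ACGTACGTTT"))
print(meets_identity("ACGTACGTAC", "ACGTACGTTT", min_identity=70.0))
```

A gapped alignment would be needed for real sequence comparison; the ungapped window is used only to keep the arithmetic visible.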

Hybridization techniques for probing a microbial DNA library to isolate target DNA of potential interest 
are well known in the art and any of those which are described in the literature are suitable for use herein. 
Prior to fluorescence sorting the clones may be viable or non-viable. For example, in one aspect, the cells 
are fixed with paraformaldehyde prior to sorting. 

[000218] Once viable or non-viable clones containing a sequence substantially complementary to the probe 
DNA are separated by a fluorescence analyzer, polynucleotides present in the separated clones may be 
further manipulated. In some instances, it may be desirable to perform an amplification of the target DNA 
that has been isolated. In this aspect, the target DNA is separated from the probe DNA after isolation. In 
one aspect, the clone can be grown to expand the clonal population. Alternatively, the host cell is lysed and 
the target DNA is amplified before being used to transform a new host (e.g., subcloning). 
Long PCR (Barnes, W M, Proc. Natl. Acad. Sci, USA, Mar. 15, 1994) can be used to amplify large DNA 
fragments (e. g., 35 kb). Numerous amplification methodologies are now well known in the art. 

[000219] Where the target DNA is identified in vitro, the selected DNA is then used for preparing a library 
for further processing and screening by transforming a suitable organism. 

Hosts can be transformed by artificial introduction of a vector containing a target DNA by inoculation 
under conditions conducive for such transformation. 

[000220] The resultant libraries (enriched for a polynucleotide of interest) can then be screened for clones 
which display an activity of interest. Clones can be shuttled in alternative hosts for expression of active 
compounds, or screened using methods described herein. 

[000221] Having prepared a multiplicity of clones from DNA selectively isolated via hybridization 
technologies described herein, such clones are screened for a specific activity to identify clones having a 
specified characteristic. 

[000222] The screening for activity may be effected on individual expression clones or may be initially 
effected on a mixture of expression clones to ascertain whether or not the mixture has one or more 
specified activities. If the mixture has a specified activity, then the individual clones may be re-screened for 
such activity or for a more specific activity. 
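
The mixture-first strategy of this paragraph can be sketched as follows. This is a minimal sketch under stated assumptions: the pool size, the clone identifiers and the `is_active` predicate are hypothetical, and a pooled assay is idealized as positive whenever any member clone is active (no dilution or interference effects).

```python
from typing import Callable, List, Sequence, Tuple


def screen_by_pooling(
    clones: Sequence[str],
    is_active: Callable[[str], bool],
    pool_size: int,
) -> Tuple[List[str], int]:
    """Assay mixtures first, then re-screen individual clones of positive mixtures.

    Returns the active clones and the total number of assays performed.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    hits: List[str] = []
    assays = 0
    for start in range(0, len(clones), pool_size):
        pool = clones[start:start + pool_size]
        assays += 1  # one assay on the mixture of expression clones
        if any(is_active(c) for c in pool):
            # The mixture has the specified activity: re-screen each clone.
            for clone in pool:
                assays += 1
                if is_active(clone):
                    hits.append(clone)
    return hits, assays


# Hypothetical library of 100 clones, two of which carry the activity.
library = ["c%d" % i for i in range(100)]
active = {"c7", "c42"}
hits, assays = screen_by_pooling(library, lambda c: c in active, pool_size=10)
print(hits, assays)
```

With these assumed numbers, 30 assays are used instead of the 100 that clone-by-clone screening would need; the saving shrinks as the fraction of active clones grows.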

[000223] Prior to, subsequent to or as an alternative to the in vivo biopanning described above is an 
encapsulation technique such as GMDs, which may be employed to localize at least one clone in one 
location for growth or screening by a fluorescent analyzer (e.g., FACS). 

The separated at least one clone contained in the GMD may then be cultured to expand the number of 
clones or screened on a FACS machine to identify clones containing a sequence of interest as described 
above, which can then be broken out into individual clones to be screened again on a FACS machine to 
identify positive individual clones. Screening in this manner using a FACS machine is described in patent 
application Ser. No. 08/876,276, filed June 16, 1997. Thus, for example, if a clone has a desirable activity, 
then the individual clones may be recovered and re-screened utilizing a FACS machine to determine which 
of such clones has the specified desirable activity. 

[000224] Further, it is possible to combine some or all of the above aspects such that a normalization step 
is performed prior to generation of the expression library, the expression library is then generated, the 
expression library so generated is then biopanned, and the biopanned expression library is then screened 
using a high throughput cell sorting and screening instrument. Thus there are a variety of options, 
including: (i) generate the library and then screen it; (ii) normalize the target DNA, generate the 
expression library and screen it; (iii) normalize, generate the library, biopan and screen; or (iv) generate, 
biopan and screen the library. 
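
Options (i) through (iv) differ only in whether the normalization and biopanning steps are included. A minimal sketch of that enumeration follows; the step labels and the function names are illustrative assumptions, not terms defined by the specification.

```python
from typing import List


def workflow(normalize: bool, biopan: bool) -> List[str]:
    """Ordered steps for one option; generation and screening always occur."""
    steps: List[str] = []
    if normalize:
        steps.append("normalize")  # optional, before library generation
    steps.append("generate")
    if biopan:
        steps.append("biopan")  # optional, before activity screening
    steps.append("screen")
    return steps


def all_workflows() -> List[List[str]]:
    """The four combinations of the two optional steps."""
    return [workflow(n, b) for n in (False, True) for b in (False, True)]


for option in all_workflows():
    print(" -> ".join(option))
```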

[000225] The library may, for example, be screened for a specified enzyme activity. For example, the 
enzyme activity screened for may be one or more of the six IUB classes: oxidoreductases, transferases, 
hydrolases, lyases, isomerases and ligases. The recombinant enzymes which are determined to be positive 
for one or more of the IUB classes may then be re-screened for a more specific enzyme activity. 

[000226] Alternatively, the library may be screened for a more specialized enzyme activity. 

For example, instead of generically screening for hydrolase activity, the library may be screened for a more 
specialized activity, i. e. the type of bond on which the hydrolase acts. 

Thus, for example, the library may be screened to ascertain those hydrolases which act on one or more 
specified chemical functionalities, such as: (a) amide (peptide bonds), i.e., proteases; (b) ester bonds, i.e., 
esterases and lipases; (c) acetals, i. e. , glycosidases etc. 

[000227] As described with respect to one of the above aspects, the invention provides processes and 
systems for activity screening of clones containing selected DNA derived from either a single cell, a 
population of cells, or a mixed population of organisms or cells from more than one organism. 

[000228] The invention provides for biopanning polynucleotides from a mixed population of organisms by 
separating the clones or polynucleotides positive for a sequence of interest with a fluorescent analyzer that 
detects fluorescence, to select polynucleotides or clones containing polynucleotides positive for a sequence 
of interest, and screening the selected clones or polynucleotides for a specified bioactivity. In one aspect, the polynucleotides 
are contained in clones having been prepared by recovering DNA of a microorganism, which DNA is 
selected by hybridization to at least one DNA sequence which is all or a portion of a DNA sequence 
encoding a bioactivity having a desirable activity. 

[000229] In another aspect, a DNA library derived from a microorganism is subjected to a selection 
procedure to select therefrom DNA which hybridizes to one or more probe DNA sequences which is all or 
a portion of a DNA sequence encoding an activity having a desirable activity by contacting a DNA library 
with a fluorescent labeled DNA probe under conditions permissive of hybridization so as to produce a 
double-stranded complex of probe and members of the DNA library. 

[000230] The present invention offers the ability to screen for many types of bioactivities. 

For instance, the ability to select and combine desired components from a library of polyketides and 
postpolyketide biosynthesis genes for generation of novel polyketides for study is appealing. The method(s) 
of the present invention make it possible to and facilitate the cloning of novel polyketide synthase genes 
and/or gene pathways, and other relevant pathways or genes encoding commercially relevant secondary 
metabolites, since one can generate gene banks with clones containing large inserts (especially when using 
vectors which can accept large inserts, such as the f-factor based vectors), which facilitates cloning of gene 
clusters. 

[000231] The biopanning approach described above can be used to create libraries enriched with clones 
carrying sequences substantially homologous to a given probe sequence. Using this approach, libraries 
containing clones with inserts of up to 40 kbp or larger can be enriched approximately 1,000 fold after each 
round of panning. This enables one to reduce the number of clones to be screened after 1 round of 
biopanning enrichment. This approach can be applied to create libraries enriched for clones carrying 
sequence of interest related to a bioactivity of interest, for example, polyketide sequences. 
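
The effect of an approximately 1,000-fold enrichment per round on screening effort can be worked through numerically. The sketch below assumes a constant fold enrichment in every round and caps the frequency at 1; the one-in-a-million starting frequency is a hypothetical value chosen for illustration and does not come from the specification.

```python
def enriched_frequency(
    initial_frequency: float,
    rounds: int,
    fold_per_round: float = 1000.0,
) -> float:
    """Fraction of clones carrying the target after a number of panning rounds."""
    if not 0.0 < initial_frequency <= 1.0:
        raise ValueError("initial_frequency must be in (0, 1]")
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    # A frequency cannot exceed 1 (every clone carries the target).
    return min(1.0, initial_frequency * fold_per_round ** rounds)


def expected_clones_per_hit(frequency: float) -> float:
    """Average number of clones screened to encounter one target clone."""
    if not 0.0 < frequency <= 1.0:
        raise ValueError("frequency must be in (0, 1]")
    return 1.0 / frequency


start = 1e-6  # hypothetical: one target clone per million
for k in range(3):
    f = enriched_frequency(start, k)
    print(k, f, expected_clones_per_hit(f))
```

Under these assumptions, a single round moves the expected screening effort from roughly a million clones per hit to roughly a thousand.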

[000232] Hybridization screening using high density filters or biopanning has proven an efficient approach 
to detect homologues of pathways containing genes of interest to discover novel bioactive molecules that 
may have no known counterparts. Once a polynucleotide of interest is enriched in a library of clones it may 
be desirable to screen for an activity. For example, it may be desirable to screen for the expression of small 
molecule ring structures or "backbones". Because the genes encoding these polycyclic structures can often 
be expressed in E. coli, the small molecule backbone can be manufactured, even if in an inactive form. 

Bioactivity is conferred upon transferring the molecule or pathway to an appropriate host that expresses the 
requisite glycosylation and methylation genes that can modify or "decorate" the structure to its active form. 
Thus, even inactive ring compounds recombinantly expressed in E. coli can be detected to identify clones, 
which are then shuttled to a metabolically rich host, such as Streptomyces (e.g., Streptomyces diversae or 
venezuelae) for subsequent production of the bioactive molecule. It should be understood that E. coli can 
produce active small molecules and in certain instances it may be desirable to shuttle clones to a 
metabolically rich host for "decoration" of the structure, but not required. The use of high throughput robotic 
systems allows the screening of hundreds of thousands of clones in multiplexed arrays in microtiter dishes. 

[000233] One approach to detect and enrich for clones carrying these structures is to use FACS screening, a 
procedure described and exemplified in U.S. Ser. No. 08/876,276, filed June 16, 1997. Polycyclic ring 
compounds typically have characteristic fluorescent spectra when excited by ultraviolet light. Thus, clones 
expressing these structures can be distinguished from background using a sufficiently sensitive detection 
method. High throughput FACS screening can be utilized to screen for small molecule backbones in, for 
example, E. coli libraries. Commercially available FACS machines are capable of screening up to 100,000 
clones per second for UV active molecules. These clones can be sorted for further FACS screening or the 
resident plasmids can be extracted and shuttled to Streptomyces for activity screening. 
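
To put the stated rate of up to 100,000 clones per second in perspective, the following sketch estimates sorting time for a library. The library size and the `coverage` factor (how many times, on average, each clone passes the detector) are hypothetical parameters introduced here, and the calculation ignores instrument dead time and sample handling.

```python
def sort_time_seconds(
    n_clones: int,
    clones_per_second: float = 100_000.0,
    coverage: float = 1.0,
) -> float:
    """Idealized time to pass a library through the sorter at a fixed rate."""
    if n_clones < 0 or clones_per_second <= 0 or coverage <= 0:
        raise ValueError("arguments must be positive (n_clones may be zero)")
    return n_clones * coverage / clones_per_second


# Hypothetical library of one million clones, sampled once and three times over.
print(sort_time_seconds(1_000_000))
print(sort_time_seconds(1_000_000, coverage=3.0))
```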

[000234] In another aspect, a bioactivity or biomolecule or compound is detected by using various 
electromagnetic detection devices, including, for example, optical, magnetic and thermal detection 
associated with a flow cytometer. Flow cytometers typically use an optical method of detection 
(fluorescence, scatter, and the like) to discriminate individual cells or particles from within a large 
population. There are several non-optical technologies that could be used alone or in conjunction with the 
optical methods to enable new discrimination/screening paradigms. 

[000235] In another aspect, a bioactivity or biomolecule or compound is detected using Fluorescence in 
situ hybridization (FISH), which allows the detection of single microorganisms out of a mixed population 
without disintegrating the cell structure. 

Fluorescently labeled or biotinylated oligonucleotides are stringently hybridized to their respective specific 
binding site (ribosomal or messenger RNA), and may be detected by epifluorescence microscopy, which
allows visualization of target organisms, or fluorescence activated cell sorting (high-throughput) may be 
done before detection, and any combination thereof. 

[000236] Magnetic field sensing is one such technique that can be used as an alternative to or in conjunction
with, for example, fluorescence based methods. Hall-Effect Sensors are one example of sensors that can be
employed. Superconducting Quantum Interference Devices ("SQUIDS") are the most sensitive sensors for
magnetic flux and magnetic fields so far developed. A standardized criterion for the sensitivity of a
SQUID is its energy resolution.

This is defined as the smallest change in energy that the SQUID can detect in one second (or in a 
bandwidth of 1 Hz). Typical values are 10^-33 J/Hz. The utility of SQUIDS can be found in the presence of
magnetosomes in certain types of bacteria that contain chains of permanent single magnetic domain
particles of magnetite (Fe3O4) or greigite (Fe3S4). The magnetic field (or residual magnetic field) of a cell
that contains a magnetosome is detected by positioning a SQUID in close proximity to the flow stream of a 
flow cytometer. Using this method cells or cells containing, for example, magnetic probes can be isolated 
based on their magnetic properties. As another example, changes in the synthetic pathway of magnetosome 
containing bacteria can be measured using a similar technique. Such techniques can be used to identify 
agents which modulate the synthetic pathway of magnetosomes. 

[000237] Measuring dynamic charge properties is another technique that can be used as an alternative to
or in conjunction with, for example, fluorescence based methods. Multiple Coupling Spectroscopy
("MCS") directly measures the dynamic charge properties of systems without the need for labeling.
Structural changes that occur when molecules interact result in representative changes in charge
distribution, and these produce a dielectric based spectrum or "signature" that reveals the affinity, specificity
and functionality of each interaction. 

Similar changes in charge distribution occur in cellular systems. By observing the changes in these 
signatures, the dynamics of molecular pathways and cellular function can be resolved in their native 
conditions. MCS utilizes a small microwave (500 MHz to 50 GHz) transceiver that could be positioned in 
close proximity to the flow stream of a flow cytometer. Because of the short measurement times (e. g., 
microseconds) required, a complete MCS signature for each cell within the stream of a flow cytometer can 
be generated and analyzed. Certain cells can then be sorted and/or isolated based on either spectral features 
which are known a priori or based on some statistical variation from a general population. Examples of 
uses for this technique include selection of expression mutants, small molecule pre-screening, and the like. 
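
The sort decision "based on some statistical variation from a general population" can be illustrated with a minimal sketch. It assumes each cell's MCS signature has already been reduced to a single scalar feature; the function name select_outliers and the cutoff of three standard deviations are hypothetical choices made for illustration, not parameters taken from this disclosure.

```python
from statistics import mean, stdev

def select_outliers(features, z_cutoff=3.0):
    """Return indices of cells whose feature deviates from the population
    mean by more than z_cutoff sample standard deviations.

    The scalar feature and the cutoff are illustrative assumptions.
    """
    mu = mean(features)
    sigma = stdev(features)
    if sigma == 0:
        return []  # no variation in the population, nothing to sort out
    return [i for i, x in enumerate(features)
            if abs(x - mu) / sigma > z_cutoff]

# Fifty similar cells and one cell with a markedly different signature.
population = [1.0] * 50 + [10.0]
flagged = select_outliers(population)
```

In an actual instrument the feature would more likely be a vector of spectral values, and the decision would have to be made within the transit time of the cell through the flow stream.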

[000238] In one screening approach, biomolecules from candidate clones can be tested for bioactivity by 
susceptibility screening against test organisms such as Staphylococcus aureus, Micrococcus luteus, E. coli,
or Saccharomyces cerevisiae. FACS screening can be used in this approach by co-encapsulating clones 
with the test organism. 

[000239] An alternative to the above-mentioned screening methods provided by the present invention is an 
approach termed "mixed extract" screening. The "mixed extract" screening approach takes advantage of the
fact that the accessory genes needed to confer activity upon the polycyclic backbones are expressed in
metabolically rich hosts, such as Streptomyces, and that the enzymes can be extracted and combined with 
the backbones extracted from E. coli clones to produce the bioactive compound in vitro. Enzyme extract 
preparations from metabolically rich hosts, such as Streptomyces strains, at various growth stages are 
combined with pools of organic extracts from E. coli libraries and then evaluated for bioactivity. 

Another approach to detect activity in the E. coli clones is to screen for genes that can convert bioactive 
compounds to different forms. For example, a recombinant enzyme was recently discovered that can
convert the low value daunomycin to the higher value doxorubicin. 

Similar enzyme pathways are being sought to convert penicillins to cephalosporins. 

[000240] Screening may be carried out to detect a specified enzyme activity by procedures known in the 
art. For example, enzyme activity may be screened for one or more of the six IUB classes: oxidoreductases,
transferases, hydrolases, lyases, isomerases and ligases. The recombinant enzymes which are determined to
be positive for one or more of the IUB classes may then be re-screened for a more specific enzyme activity.
Alternatively, the library may be screened for a more specialized enzyme activity. For example, instead of
generically screening for hydrolase activity, the library may be screened for a more specialized activity,
i.e., the type of bond on which the hydrolase acts. Thus, for example, the library may be screened to ascertain
those hydrolases which act on one or more specified chemical functionalities, such as: (a) amide (peptide
bonds), i.e., proteases; (b) ester bonds, i.e., esterases and lipases; (c) acetals, i.e., glycosidases.

[000241] FACS screening can also be used to detect expression of UV fluorescent molecules in any host,
including metabolically rich hosts, such as Streptomyces. For example, recombinant oxytetracycline retains
its diagnostic red fluorescence when produced heterologously in S. lividans TK24. Pathway clones, which 
can be sorted by FACS, can thus be screened for polycyclic molecules in a high throughput fashion. 

[000242] Recombinant bioactive compounds can also be screened in vivo using "two-hybrid" systems,
which can detect enhancers and inhibitors of protein-protein or other interactions such as those between 
transcription factors and their activators, or receptors and their cognate targets. In this aspect, both the small 
molecule pathway and the reporter construct are co-expressed. Clones altered in reporter expression can 
then be sorted by FACS and the pathway clone isolated for characterization. 

[000243] As indicated, common approaches to drug discovery involve screening assays in which disease 
targets (macromolecules implicated in causing a disease) are exposed to potential drug candidates which 
are tested for therapeutic activity. In other approaches, whole cells or organisms that are representative of 
the causative agent of the disease, such as bacteria or tumor cell lines, are exposed to the potential 
candidates for screening purposes. 

Any of these approaches can be employed with the present invention. 

[000244] The present invention also allows for the transfer of cloned pathways derived from uncultivated 
samples into metabolically rich hosts for heterologous expression and downstream screening for bioactive 
compounds of interest using a variety of screening approaches briefly described above. 

[000245] Recovering Desirable Bioactivities [000246] In one aspect, after viable or non-viable cells, each
containing a different expression clone from the gene library, are screened and positive clones are recovered,
DNA can be isolated from positive clones utilizing techniques well known in the art. The DNA can then be
amplified either in vivo or in vitro by utilizing any of the various amplification techniques known in the art.
In vivo amplification would include transformation of the clone(s) or subclone(s) into a viable host,
followed by growth of the host. In vitro amplification can be performed using techniques such as the 
polymerase chain reaction. 

Once amplified, the identified sequences can be "evolved" or sequenced.

Evolution [000247] In one aspect, the present invention manipulates the identified polynucleotides to 
generate and select for encoded variants with altered activity or specificity. Clones found to have the 
bioactivity for which the screen was performed can be subjected to directed mutagenesis to develop new
bioactivities with desired properties or to develop modified bioactivities with particularly desired properties 
that are absent or less pronounced in the wild-type activity, such as stability to heat or organic solvents. 
Any of the known techniques for directed mutagenesis are applicable to the invention. For example, 
mutagenesis techniques for use in accordance with the invention include those described below. 

[000248] Alternatively, it may be desirable to variegate a polynucleotide sequence obtained, identified or
cloned as described herein. Such variegation can modify the polynucleotide sequence in order to modify
(e.g., increase or decrease) the encoded polypeptide's activity, specificity, affinity, function, etc. Such
evolution methods are known in the art or described herein, such as shuffling, cassette mutagenesis,
recursive ensemble mutagenesis, sexual PCR, directed evolution, exonuclease-mediated reassembly, codon
site-saturation mutagenesis, amino acid site-saturation mutagenesis, gene site saturation mutagenesis,
introduction of mutations by non-stochastic polynucleotide reassembly methods, synthetic ligation
polynucleotide reassembly, gene reassembly, oligonucleotide-directed saturation mutagenesis, in vivo
reassortment of polynucleotide sequences having partial homology, naturally occurring recombination
processes which reduce sequence complexity, and any combination thereof.

[000249] The clones enriched for a desired polynucleotide sequence, which are identified as described 
above, may be sequenced to identify the DNA sequence (s) present in the clone, which sequence 
information can be used to screen a database for similar sequences or functional characteristics. Thus, in
accordance with the present invention it is possible to: (i) isolate and identify DNA having a sequence of
interest (e.g., a sequence encoding an enzyme having a specified enzyme activity), (ii) associate the
sequence with a known or unknown sequence in a database (e.g., a database sequence associated with an
enzyme having an activity (including the amino acid sequence thereof)), and (iii) produce recombinant 
enzymes having such activity. 

[000250] Sequencing may be performed by high throughput sequencing techniques. The exact method of 
sequencing is not a limiting factor of the invention. Any method useful in identifying the sequence of a 
particular cloned DNA sequence can be used. In general, sequencing is an adaptation of the natural process 
of DNA replication. Therefore, a template (e. g. , the vector) and primer sequences are used. One general 
template preparation and sequencing protocol begins with automated picking of bacterial colonies, each of 
which contains a separate DNA clone which will function as a template for the sequencing reaction. 

The selected clones are placed into media, and grown overnight. The DNA templates are then purified from 
the cells and suspended in water. After DNA quantification, high-throughput sequencing is performed 
using a sequencer, such as an Applied Biosystems, Inc. Prism 377 DNA Sequencer. The resulting sequence
data can then be used in additional methods, including searching a database or databases.

Database Searches and Alignment Algorithms [000251] A number of source databases are available that
contain either a nucleic acid sequence and/or a deduced amino acid sequence for use with the invention in 
identifying or determining the activity encoded by a particular polynucleotide sequence. All or a 
representative portion of the sequences (e. g. , about 100 individual clones) to be tested are used to search a 
sequence database (e. g. , GenBank, PFAM or ProDom), either simultaneously or individually. A number 
of different methods of performing such sequence searches are known in the art. The databases can be 
specific for a particular organism or a collection of organisms. For example, there are databases for C.
elegans, Arabidopsis sp., M. genitalium, M. jannaschii, E. coli, H. influenzae and S. cerevisiae, among others.
The sequence data of the clone is then aligned to the sequences in the database or databases, known in the 
art, using algorithms designed to measure homology between two or more sequences. 

[000252] Such sequence alignment methods include, but are not limited to, BLAST (Altschul
et al., 1990), BLITZ (MPsrch) (Sturrock & Collins, 1993), and FASTA (Pearson & Lipman, 1988).
The probe sequence (e.g., the sequence data from the clone) can be any length, and will be recognized as
homologous based upon a threshold homology value. The threshold value may be predetermined, although
this is not required. The threshold value can be based upon the particular polynucleotide length. To align
sequences, a number of different procedures can be used. Typically, Smith-Waterman or Needleman-Wunsch
algorithms are used. However, as discussed, faster procedures such as BLAST, FASTA, PSI-BLAST,
and others known in the art can be used.
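
As an illustration of the local homology (Smith-Waterman) approach named above, the following is a minimal sketch that computes only the best local alignment score between two sequences. The scoring values (match=2, mismatch=-1, gap=-2) and the linear gap penalty are assumptions chosen for the example and are not specified in this disclosure.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Uses a linear gap penalty; scoring values are illustrative assumptions.
    Only two rows of the dynamic programming matrix are kept in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

score = smith_waterman_score("TTACGTTT", "GGACGTGG")  # shared core "ACGT"
```

A score computed this way could be compared against the threshold homology value discussed above. The run time grows with the product of the two sequence lengths, which is one reason faster heuristic procedures are often preferred for database searches.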

[000253] Non-limiting examples are as follows. For example, optimal alignment of sequences for aligning a
comparison window may be conducted by the local homology algorithm of Smith (Smith and Waterman,
Adv Appl Math, 1981; Smith and Waterman, J Theor Biol, 1981; Smith and Waterman, J Mol Biol, 1981;
Smith et al., J Mol Evol, 1981), by the homology alignment algorithm of Needleman (Needleman and
Wunsch, 1970), by the search for similarity method of Pearson (Pearson and Lipman, 1988), by
computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the
Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575 Science Dr., Madison,
WI, or the Sequence Analysis Software Package of the Genetics Computer Group, University of
Wisconsin, Madison, WI), or by inspection, and the best alignment (i.e., resulting in the highest percentage
of homology over the comparison window) generated by the various methods is selected. The similarity of
the two sequences (i.e., the probe sequence and the database sequence) can then be predicted.

[000254] Such software matches similar sequences by assigning degrees of homology to various deletions, 
substitutions and other modifications. The terms "homology" and "identity" in the context of two or more
nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same 
or have a specified percentage of amino acid residues or nucleotides that are the same when compared and 
aligned for maximum correspondence over a comparison window or designated region as measured using 
any number of sequence comparison algorithms or by manual alignment and visual inspection. 

[000255] For sequence comparison, typically one sequence acts as a reference sequence, to which test 
sequences are compared. When using a sequence comparison algorithm, test and reference sequences are 
entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm 
program parameters are designated. Default program parameters can be used, or alternative parameters can
be designated. The sequence comparison algorithm then calculates the percent sequence identities for the 
test sequences relative to the reference sequence, based on the program parameters. 

[000256] A "comparison window", as used herein, includes reference to a segment of any one of the
number of contiguous positions selected from the group consisting of from 20 to 600, usually about 50 to
about 200, more usually about 100 to about 150 in which a sequence may be compared to a reference 
sequence of the same number of contiguous positions after the two sequences are optimally aligned. 

[000257] One example of an algorithm used in the methods of the invention is the BLAST and BLAST 2.0
algorithms, which are described in Altschul et al., J. Mol. Biol. 215: 403-410 (1990) and Altschul et
al., Nuc. Acids Res. 25: 3389-3402 (1997), respectively. Software for performing BLAST analyses is publicly
available through the National Center for Biotechnology Information. This algorithm involves first 
identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query 
sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word 
of the same length in a database sequence. T is referred to as the neighborhood word score threshold 
(Altschul et al. , supra). These initial neighborhood word hits act as seeds for initiating searches to find 
longer HSPs containing them. The word hits are extended in both directions along each sequence for as far 
as the cumulative alignment score can be increased. 

Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair
of matching residues; always >0) and N (penalty score for mismatching residues; always <0). The BLAST
algorithm parameters W, T, and X determine the
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a
wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a comparison of both strands.
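
The seed-and-extend idea described above can be sketched in simplified form. The example below is not the BLAST implementation: it uses exact word matches of length W as seeds, extends them without gaps in both directions, and scores with the BLASTN defaults quoted above (W=11, M=5, N=-4). The stopping margin x_drop=20 is a hypothetical value chosen for the example, and the neighborhood word score threshold T and the statistical evaluation are omitted.

```python
def seed_and_extend(query, subject, w=11, m=5, n=-4, x_drop=20):
    """Best ungapped segment score reachable from exact w-mer seeds.

    A simplified sketch; x_drop is an assumed stopping margin.
    """
    # Index every word of length w in the subject by its start position.
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)

    def extend(i, j, step):
        # Walk outward from the seed and keep the best cumulative gain.
        score = best = 0
        while 0 <= i < len(query) and 0 <= j < len(subject):
            score += m if query[i] == subject[j] else n
            if score > best:
                best = score
            elif best - score > x_drop:
                break  # score fell too far below the best seen so far
            i += step
            j += step
        return best

    best_total = 0
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            total = (w * m
                     + extend(i - 1, j - 1, -1)
                     + extend(i + w, j + w, 1))
            best_total = max(best_total, total)
    return best_total

identical = seed_and_extend("ACGTACGTACGTAC", "ACGTACGTACGTAC")
one_mismatch = seed_and_extend("ACGTACGTACGTACTGGG", "ACGTACGTACGTACAGGG")
```

Raising W reduces the number of seeds, which speeds up the search but can miss weaker matches. This is the trade-off between sensitivity and speed referred to above.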

[000258] The BLAST algorithm also performs a statistical analysis of the similarity between two 
sequences (see, e. g. , Karlin & Altschul, Proc. Natl. Acad. Sci. USA 90: 5873 (1993)). 

One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which
provides an indication of the probability by which a match between two nucleotide sequences would occur
by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum
probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more
preferably less than about 0.01, and most preferably less than about 0.001.
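
The three probability cutoffs above can be applied as a simple tiered filter. The sketch below assumes that the smallest sum probability has already been obtained from a search program; the function name similarity_tier and the tier labels are hypothetical, and the cutoffs are treated as exact values although the text states them as approximate.

```python
def similarity_tier(p_n):
    """Map a smallest sum probability P(N) to a tier label.

    Cutoffs 0.2, 0.01 and 0.001 follow the text; labels are illustrative.
    """
    if not 0.0 <= p_n <= 1.0:
        raise ValueError("P(N) must lie between 0 and 1")
    if p_n < 0.001:
        return "most preferred"
    if p_n < 0.01:
        return "more preferred"
    if p_n < 0.2:
        return "similar"
    return "not similar"

tiers = [similarity_tier(p) for p in (0.5, 0.1, 0.005, 0.0001)]
```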

[000259] Sequence homology (sequence identity) means that two polynucleotide sequences are 
homologous (i.e., on a nucleotide-by-nucleotide basis) over the window of comparison.

A percentage of sequence identity or homology is calculated by comparing two optimally aligned 
sequences over the window of comparison, determining the number of positions at which the identical 
nucleic acid base (e.g., A, T, C, G, U, or I) occurs in both sequences to yield the number of matched
positions, dividing the number of matched positions by the total number of positions in the window of 
comparison (i. e. , the window size), and multiplying the result by 100 to yield the percentage of sequence 
homology. This substantial homology denotes a characteristic of a polynucleotide sequence, wherein the 
polynucleotide comprises a sequence having at least 60 percent sequence homology, typically at least 70 
percent homology, often 80 to 90 percent sequence homology, and most commonly at least 99 percent 
sequence homology as compared to a reference sequence of a comparison window of at least 25-50 
nucleotides, wherein the percentage of sequence homology is calculated by comparing the reference 
sequence to the polynucleotide sequence which may include deletions or additions which total 20 percent 
or less of the reference sequence over the window of comparison. 
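
The percentage calculation described above (matched positions divided by window size, multiplied by 100) can be written out as a short sketch. It assumes the two sequences have already been optimally aligned and padded to equal length, with "-" marking a deletion or addition; the gap character and the function name percent_identity are assumptions made for the example.

```python
def percent_identity(aligned_ref, aligned_test, gap="-"):
    """Percent identity over a comparison window of pre-aligned sequences.

    Positions where either sequence has a gap never count as matches,
    but they still count toward the window size.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_ref:
        raise ValueError("comparison window is empty")
    matched = sum(1 for r, t in zip(aligned_ref, aligned_test)
                  if r == t and r != gap)
    return 100.0 * matched / len(aligned_ref)

identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")  # one mismatch in ten
```

Under the thresholds stated above, a value of 90.0 over a sufficiently long window would fall within the "often 80 to 90 percent" range.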

[000260] Sequences having sufficient homology (sequence identity) can then be further identified by any 
annotations contained in the database, including, for example, species and activity information. 
Accordingly, in a typical mixed population sample, a plurality of nucleic acid sequences will be obtained, 
cloned, sequenced and corresponding homologous sequences from a database identified. This information 
provides a profile of the polynucleotides present in the sample, including one or more features associated 
with the polynucleotide including the organism and activity associated with that sequence or any 
polypeptide encoded by that sequence based on the database information. As used herein,
"fingerprint" or "profile" refers to the fact that each sample will have associated with it a set of
polynucleotides characteristic of the sample and the environment from which it was derived. Such a profile 
can include the amount and type of sequences present in the sample, as well as information regarding the 
potential activities encoded by the polynucleotides and the organisms from which polynucleotides were 
derived. This unique pattern is each sample's profile or fingerprint.
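
One way to represent such a profile in software is as a tally of the annotations returned for each sequenced clone. The sketch below is a minimal illustration; the input format (a list of organism and activity pairs) and the function name build_profile are hypothetical and are not taken from this disclosure.

```python
from collections import Counter

def build_profile(hits):
    """Summarize database annotations for one sample.

    hits: list of (organism, activity) tuples, one per matched clone.
    This input format is an assumption made for the example.
    """
    organisms = Counter(org for org, _ in hits)
    activities = Counter(act for _, act in hits)
    return {
        "clones": len(hits),
        "organisms": dict(organisms),
        "activities": dict(activities),
    }

sample_hits = [
    ("S. cerevisiae", "hydrolase"),
    ("E. coli", "hydrolase"),
    ("E. coli", "ligase"),
]
profile = build_profile(sample_hits)
```

Two samples could then be compared by comparing the organism and activity counts in their profiles.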

[000261] In some instances it may be desirable to express a particular cloned polynucleotide sequence once 
its identity or activity is determined or a demonstrated identity or activity is associated with the 
polynucleotide. In such instances the desired clone, if not already cloned into an expression vector, is
ligated downstream of a regulatory control element (e. g. , a promoter or enhancer) and cloned into a 
suitable host cell. Expression vectors are commercially available along with corresponding host cells for 
use in the invention. 

[000262] Representative examples of expression vectors which may be used in the method of the present
invention include, but are not limited to, viral particles, baculovirus, phage,
plasmids, phagemids, cosmids, fosmids, bacterial artificial chromosomes, viral nucleic acid (e.g.,
vaccinia, adenovirus, fowl pox virus, pseudorabies and derivatives of SV40), P1-based artificial
chromosomes, yeast plasmids, yeast artificial chromosomes, and any other vectors specific for specific
hosts of interest (such as Bacillus, Aspergillus, yeast, etc.). Thus, for example, the DNA may be included in
any one of a variety of expression vectors for expressing a polypeptide. Such vectors include, but are not 
limited to, chromosomal, nonchromosomal and synthetic DNA sequences. Large numbers of suitable 
vectors are known to those of skill in the art, and are commercially available. The following vectors are 
provided by way of example only: ZAP Express, Lambda ZAP-CMV, Lambda ZAP II, Lambda
gt10, Lambda gt11, pMyr, pSos, pCMV-Script, pCMV-Script XR, pBK Phagemid, pBK-CMV, pBK-RSV,
pBluescript II Phagemid, pBluescript II KS, pBluescript II SK, pBluescript II SK-, Lambda FIX II,
Lambda DASH II, Lambda EMBL3 and EMBL4, EMBL3, EMBL4, SuperCos I and pWE15, pWE15,
SuperCos I, pPCR-Script Amp, pPCR-Script Cam, pCMV-Script, pBC KS +, pBC KS-, pBC SK +, pBC
SK-, psiX174, pNH8A, pNH16a, pNH18A, pNH46A (Stratagene); pT7Blue, pSTBlue, pCITE, pET,
pTriEx, pForce (Novagen); pIND-E, pIND Vector, pIND/Hygro, pIND(SP1)/Hygro, pIND/GFP,
pIND(SP1)/GFP, pIND/V5-His and pIND(SP1)/V5-His Tag, pIND TOPO TA, shooter Targeting Vectors,
pTracer GFP Reporter Vectors, pcDNA Vector Collection, EBV Vectors, Voyager VP22 Vectors,
pVAX1-DNA vaccine vector, pcDNA4/His-Max, pBC1 Mouse Milk System (Invitrogen); pQE70, pQE60,
pQE-9, pQE-16, pQE-30/pQE-80, pQE 31/pQE 81, pQE-32/pQE 82, pQE-40, pQE-100 Double Tag
(Qiagen); pTRC99a, pKK223-3, pKK233-3, pDR540, pRIT5, pWLNEO, pSV2CAT, pOG44, pXT1, pSG
(Stratagene), pSVK3, pBPV, pMSG, pSVL (Pharmacia). However, one skilled in the art would understand
that any other plasmid or vector may be used as long as it is replicable and viable in the host.

[000263] The nucleic acid sequence in the expression vector can be operatively linked to an appropriate 
expression control sequence(s) (promoter) to direct mRNA synthesis. Particular named bacterial promoters
include lacI, lacZ, T3, T7, gpt, lambda PR, PL, SP6, trp, lacUV5, PBAD, araBAD, araB, trc, proU, p-D-HSP,
HSP, GAL4 UAS/E1b, TK, GAL1, CMV/TetO2 Hybrid, EF-1a CMV, EF-1a CMV, EF-1a CMV, EF,
EF-1a, ubiquitin C, rsv-ltr, rsv, b-lactamase, nmt1, and gal10. Eukaryotic promoters include CMV
immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse 
metallothionein-I. 

Selection of the appropriate vector and promoter is well within the level of ordinary skill in the art. The 
expression vector also contains a ribosome binding site for translation initiation and a transcription 
terminator. The vector may also include appropriate sequences for amplifying expression. Promoter regions 
can be selected from any desired gene using CAT (chloramphenicol acetyltransferase) vectors or other vectors
with selectable markers. 

[000264] In addition, the expression vectors can contain one or more selectable marker genes to
provide a phenotypic trait for selection of transformed host cells such as dihydrofolate reductase or 
neomycin resistance for eukaryotic cell culture, or such as tetracycline or ampicillin resistance in E. coli. 

[000265] The nucleic acid sequence (s) selected, cloned and sequenced as hereinabove described can 
additionally be introduced into a suitable host to prepare a library which is screened for the desired enzyme 
activity. The selected nucleic acid is preferably already in a vector which includes appropriate control 
sequences whereby a selected nucleic acid encoding an enzyme may be expressed, for detection of the 
desired activity. The host cell can be a higher eukaryotic cell, such as a mammalian cell, or a lower 
eukaryotic cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a bacterial cell. The 
selection of an appropriate host is deemed to be within the scope of those skilled in the art from the 
teachings herein. 

[000266] In some instances it may be desirable to perform an amplification of the nucleic acid sequence 
present in a sample or a particular clone that has been isolated. In this aspect the nucleic acid sequence is 
amplified by PCR or a similar reaction known to those of skill in the art. Kits for carrying out such
amplification reactions are commercially available.

[000267] In another aspect, amplification of the nucleic acid sequence may be done by multiple 
displacement amplification (MDA) or by rolling circle amplification (RCA). (See, for example, U.S.
Ser. Nos. 60/573,473, filed May 21, 2004; 10/633,248, filed July 31, 2003; and 09/875,412, filed June
1, 2001, each incorporated herein by reference in its entirety.)

[000268] In addition, it is important to
recognize that the alignment algorithms and searchable database can be implemented in computer
hardware, software or a combination thereof. Accordingly, the isolation, processing and identification of
nucleic acid sequences and the corresponding polypeptides encoded by those sequences can be implemented
in an automated system.

Capillary-Based Screening [000269] In one aspect, the invention provides systems and methods using 
capillary based screening of biomolecules. Figure 6A shows a capillary array (10) which includes a
plurality of individual capillaries (20) having at least one outer wall (30) defining a lumen (40). The outer 
wall (30) of the capillary (20) can be one or more walls fused together. Similarly, the wall can define a 
lumen (40) that is cylindrical, square, hexagonal or any other geometric shape so long as the walls form a 
lumen for retention of a liquid or sample. The capillaries (20) of the capillary array (10) are held together in 
close proximity to form a planar structure. 

The capillaries (20) can be bound together by being fused (e.g., where the capillaries are made of glass),
glued, bonded, or clamped side-by-side. The capillary array (10) can be formed of any number of
individual capillaries (20). In one aspect, the capillary array includes 100 to 4,000,000 capillaries (20). In
one aspect, the capillary array includes 100 to 500,000,000 capillaries (20). In one aspect, the capillary
array includes 100,000 capillaries (20). In one specific aspect, the capillary array (10) can be formed to
conform to a microtiter plate footprint, i.e., 127.76 mm by 85.47 mm, with tolerances. The capillary array
(10) can have a density of 500 to more than 1,000 capillaries (20) per cm2, or about 5 capillaries per mm2.
For example, a microtiter plate size array of 3 µm capillaries would have about 500 million capillaries.
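
The relationship between footprint, density and capillary count can be checked with a short calculation. The sketch below uses the footprint given above (127.76 mm by 85.47 mm). The center-to-center pitch of 4.67 µm for 3 µm capillaries is an assumption chosen only to show what spacing on a square grid would correspond to roughly 500 million capillaries; it is not a value stated in this disclosure, and both function names are hypothetical.

```python
import math

def capillaries_from_density(length_mm, width_mm, per_cm2):
    """Capillary count for a rectangular footprint at a given areal density."""
    area_cm2 = (length_mm * width_mm) / 100.0  # 100 mm^2 per cm^2
    return int(area_cm2 * per_cm2)

def capillaries_from_pitch(length_mm, width_mm, pitch_um):
    """Capillary count on a square grid with an assumed center-to-center pitch."""
    per_row = math.floor(length_mm * 1000.0 / pitch_um)
    per_col = math.floor(width_mm * 1000.0 / pitch_um)
    return per_row * per_col

at_500_per_cm2 = capillaries_from_density(127.76, 85.47, 500)
at_assumed_pitch = capillaries_from_pitch(127.76, 85.47, 4.67)
```

At 500 capillaries per cm2 the footprint holds on the order of 55,000 capillaries, whereas the assumed 4.67 µm pitch gives approximately 500 million.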

[000270] The capillaries (20) can be formed with an aspect ratio of 50:1. In one aspect, each capillary (20)
has a length of approximately 10 mm, and an internal diameter of the lumen (40) of approximately 200 µm.
However, other aspect ratios are possible, and range from 10:1 to well over 1000:1. Accordingly, the
thickness of the capillary array can vary from 0.5 mm to over 10 cm. Individual capillaries (20) have an
inner diameter that ranges from 3-500 µm and 0-500 µm. A capillary (20) having an internal diameter of
200 µm and a length of 1 cm has a volume of approximately 0.3 µl. The length and width of each capillary
(20) is based on a desired volume and other characteristics discussed in more detail below, such as 
evaporation rate of liquid from within the capillary, and the like. Capillaries of the invention may include a 
volume as low as 250 nanoliters/well. 
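
The volume and aspect ratio figures above follow from treating the lumen as a cylinder. The sketch below reproduces that arithmetic; the cylindrical lumen is an assumption (the lumen may also be square or hexagonal), and the function names are hypothetical.

```python
import math

def capillary_volume_ul(inner_diameter_um, length_mm):
    """Volume of a cylindrical lumen in microliters (1 mm^3 equals 1 µl)."""
    radius_mm = inner_diameter_um / 1000.0 / 2.0
    return math.pi * radius_mm ** 2 * length_mm

def aspect_ratio(inner_diameter_um, length_mm):
    """Ratio of capillary length to inner diameter."""
    return length_mm * 1000.0 / inner_diameter_um

volume = capillary_volume_ul(200, 10)  # 200 µm bore, 10 mm long
ratio = aspect_ratio(200, 10)
```

For a 200 µm by 10 mm capillary this gives about 0.31 µl and a ratio of 50, consistent with the approximate 0.3 µl volume and 50:1 aspect ratio stated above.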

[000271] In accordance with one aspect of the invention, one or more particles are introduced into each 
capillary (20) for screening. Suitable particles include cells, cell clones, and other biological matter, 
chemical beads, or any other particulate matter. The capillaries (20) containing particles of interest can be 
introduced with various types of substances for causing an activity of interest. The introduced substance 
can include a liquid having a developer or nutrients, for example, which assists in cell growth and which 
results in the production of enzymes. Or, a chemical solution containing new particles can cause a 
combining event with other chemical beads already introduced into one or more capillaries (20). The 
particles and resulting activity of interest are screened and analyzed using the capillary array (10) according 
to the present invention. In one aspect, the activity produces a change in properties of matter within the 
capillary (20), such as optical properties of the particles. Each capillary can act as a waveguide for guiding 
detectable light energy or property changes to an analyzer. The capillaries (20) can be made according to 
various manufacturing techniques. In one particular aspect, the capillaries (20) are manufactured using a 
hollow-drawn technique. A cylindrical, or other hollow shape, piece of glass is drawn out to continually 
longer lengths according to known techniques. The piece of glass is preferably formed of multiple layers. 
The drawn glass is then cut into portions of a specific length to form a relatively large capillary. The 
capillary portions are next bundled into an array of relatively large capillaries, and then drawn again to 
increasingly narrower diameters. 

During the drawing process, or when the capillaries are formed to a desired width, application of heat can 
fiise interstitial areas of adjacent capillaries together. 
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[000272] In an alternative aspect, a glass etching process is used. A solid tube of glass can be drawn out to 
a particular width, cut into portions of a specific length, and drawn again. 

Then, each solid tube portion is center-etched with an acid or other etchant to form a hollow capillary. The
tubes can be bound or fused together before or after the etch process. A number of capillary arrays (10) can 
be connected together to form an array of arrays (12), as shown in Figure 6B. The capillary arrays (10) can 
be glued together. Alternatively, the capillary arrays (10) can be fused together. According to this 
technique, the array of arrays (12) can have any desired size or footprint, formed of any number of high- 
precision capillary arrays (10). 

[000273] A large number of materials can be suitably used to form a capillary array according to the 
invention and depending on the manufacturing technique used, including without limitation, glass, metal, 
semiconductors such as silicon, quartz, ceramics, or various polymers and plastics including, among others, 
polyethylene, polystyrene, and polypropylene. The internal walls of the capillary array, or portions thereof,
may be coated or silanized to modify their surface properties. For example, the hydrophilicity or 
hydrophobicity may be altered to promote or reduce wicking or capillary action, respectively. 

The coating material includes, for example, ligands such as avidin, streptavidin, antibodies, antigens, and 
other molecules having specific binding affinity or which can withstand thermal or chemical sterilization. 

[000274] While the above-described manufacturing techniques and materials yield high precision micro-sized
capillaries and capillary arrays, the size, spacing and alignment of the capillaries within an array may
be non-uniform. In some instances, it is desirable to have two capillary arrays make contact in as close
alignment as possible, such as, for example, to transfer liquid from capillaries in a first capillary array to
capillaries in a second capillary array. One capillary array according to the invention may be cut 
horizontally along its thickness, and separated to form two capillary arrays. The two resulting capillary 
arrays will each include at least one surface having capillary openings of substantially identical size, 
spacing and alignment, and suitable for contacting together for transferring liquid from one resulting 
capillary array to the other. 

[000275] Figure 7 shows a horizontal cross section of a portion of an array of capillaries (20). Capillary 
(20) is shown having a first cylindrical wall (30), a lumen (40), a second exterior wall (50), and interstitial 
material (60) separating the capillary tubes in the array (10). 

In this aspect, the cylindrical wall (30) is comprised of a sleeve glass, while exterior wall (50) is comprised 
of an extra mural absorption (EMA) glass to minimize optical cross-talk among neighboring capillaries 
(20). 

[000276] A capillary array may optionally include reference indicia (22) for providing a positional or 
alignment reference. The reference indicia (22) may be formed of a pad of glass extending from the surface
of the capillary array, or embedded in the interstitial material (60). 

In one aspect, the reference indicia (22) are provided at one or more corners of a microtiter plate formed by
the capillary array. According to the aspect, a corner of the plate or set of capillaries may be removed, and
replaced with the reference indicia (22). The reference indicia (22) may also be formed at spaced intervals 
along a capillary array, to provide an indication of a subset of capillaries (20). 

[000277] Figure 8 depicts a vertical cross-section of a capillary of the invention. The capillary (20) 
includes a first wall (30) defining a lumen (40), and a second wall (50) surrounding the first wall (30). In 
one aspect, the second wall (50) has a lower index of refraction than the first wall (30). In one aspect, the 
first wall (30) is sleeve glass having a high index of refraction, forming a waveguide in which light from
excited fluorophores travels. In the exemplary aspect, the second wall (50) is black EMA glass, having a 
low index of refraction, forming a cladding around the first wall (30) against which light is refracted and 
directed along the first wall (30) for total internal reflection within the capillary (20). The second wall (50) 
can thus be made with any material that reduces the "cross-talk" or diffusion of light between adjacent
capillaries. Alternatively, the inside surface of the first wall (30) can be coated with a reflective substance 
to form a mirror, or mirror-like structure, for specular reflection within the lumen (40). 
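
As an illustrative aside, the light-guiding condition described above can be checked numerically. The following is a minimal sketch, assuming Snell's law at the boundary between the first wall (30) and the second wall (50). The function name and the index values are hypothetical placeholders and are not taken from this specification.

```python
import math

def critical_angle_deg(n_core: float, n_clad: float) -> float:
    """Critical angle, in degrees from the surface normal, above which light
    in the higher-index wall is totally internally reflected."""
    if n_clad >= n_core:
        raise ValueError("total internal reflection requires n_clad < n_core")
    return math.degrees(math.asin(n_clad / n_core))

# Hypothetical indices for illustration only (not from the specification).
n_first_wall = 1.60   # assumed high-index sleeve glass
n_second_wall = 1.50  # assumed lower-index EMA cladding glass

theta_c = critical_angle_deg(n_first_wall, n_second_wall)
print(f"critical angle: {theta_c:.1f} degrees")
```

Under these assumed values, rays striking the boundary at more than the critical angle from the normal remain guided in the first wall; a larger index difference lowers the critical angle and captures more of the emitted light.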

[000278] Many different materials can be used in forming the first and second walls, creating different 
indices of refraction for desired purposes. A filtering material can be formed around the lumen (40) to filter 
energy to and from the lumen (40) as depicted in Figure 9. In one aspect, the inner wall of the first wall 
(30) of each capillary of the array, or portion of the array, is coated with the filtering material. In another 
aspect, the second wall (50) includes the filtering material. For instance, the second wall (50) can be
formed of the filtering material, such as filter glass for example, or in one exemplary aspect, the second 
wall (50) is EMA glass that is doped with an appropriate amount of filtering material. The filtering material 
can be formed of a color other than black and tuned for a desired excitation/emission filtering 
characteristic. 

[000279] The filtering material allows transmission of excitation energy into the lumen (40), and blocks 
emission energy from the lumen (40) except through one or more openings at either end of the capillary 
(20). In Figure 9, excitation energy is illustrated as a solid line, while emission energy is indicated by a 
broken line. When the second wall (50) is formed with a filtering material as shown in Figure 9, certain 
wavelengths of light representing excitation energy are allowed through to the lumen (40), and other 
wavelengths of light representing emission energy are blocked from exiting, except as directed within and 
along the first wall (30). The entire capillary array, or a portion thereof, can be tuned to a specific 
individual wavelength or group of wavelengths, for filtering different bands of light in an excitation and 
detection process. 

[000280] A particle (70) is depicted within the lumen (40). During use, an excitation light is directed into
the lumen (40) contacting the particle (70) and exciting a reporter fluorescent material causing emission of 
light. The emitted light travels the length of the capillary until it reaches a detector. One advantage of an 
aspect of the present invention, where the second wall (50) is black EMA glass, is that the emitted light 
cannot cross contaminate adjacent capillary tubes in a capillary array. In addition, the black EMA glass 
refracts and directs the emitted light towards either end of the capillary tube thus increasing the signal 
detected by an optical detector (e. g. , a CCD camera and the like). 

[000281] In a detection process using a capillary array of the invention, an optical detection system is 
aligned with the array, which is then scanned for one or more bright spots, representing either a 
fluorescence or luminescence associated with a "positive." The term "positive" refers to the presence of an
activity of interest. Again, the activity can be a chemical event, or a biological event. 
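
The scan for bright spots can be sketched in code. The example below is a hedged illustration only: it assumes one intensity value per capillary and flags values far above a robust background estimate. The thresholding rule (median plus a multiple of the median absolute deviation), the function name, and the sample readout are assumptions made for this sketch and are not methods or data from this specification.

```python
from statistics import median

def find_positives(intensity, k=5.0):
    """Return (row, col) positions whose signal exceeds median + k * MAD.

    `intensity` is a list of rows with one value per capillary (an assumed
    1:1 pixel-to-capillary readout).
    """
    flat = [value for row in intensity for value in row]
    background = median(flat)
    # Median absolute deviation as a robust spread estimate; fall back to 1.0
    # if every capillary reads the same value.
    spread = median(abs(value - background) for value in flat) or 1.0
    threshold = background + k * spread
    return [
        (r, c)
        for r, row in enumerate(intensity)
        for c, value in enumerate(row)
        if value > threshold
    ]

# Hypothetical readout: two bright capillaries on a dim background.
readout = [
    [10, 12, 11, 9],
    [11, 250, 10, 12],
    [9, 10, 11, 180],
]
hits = find_positives(readout)
print(hits)
```

The returned positions could then serve as the position feedback used for recovery of putative hits.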

[000282] Figure 10 depicts a general method of sample screening using a capillary array (10) according to 
the invention. In this depiction, capillary array (10) is immersed or contacted with a container (100)
containing particles of interest. The particles can be cells, clones, molecules or compounds suspended in a 
liquid. The liquid is wicked into the capillary tubes by capillary action. The natural wicking that occurs as
a result of capillary forces obviates the need for pumping equipment and liquid dispensers. A substrate for 
measuring biological activity (e. g. , enzyme activity) can be contacted with the particles either before or 
after introduction of the particles into the capillaries in the capillary array. The substrate can include clones 
of a cell of interest, for example. The substrate can be introduced simultaneously into the capillaries by 
placing an open end of the capillaries in the container (100) containing a mixture of the particle-bearing 
liquid and the substrate. In some aspects, it is a goal to achieve a certain concentration of particles of 
interest. A particular concentration of particles may also be achieved by dilution. Figures 13A-C show one
such process, which is described below. 
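
As a hedged illustration of choosing a particle concentration, the number of particles wicked into each capillary from a well-mixed suspension is commonly modelled as Poisson-distributed. The sketch below rests on that modelling assumption, which is not stated in this specification; the function names, the capillary volume, and the concentration are hypothetical.

```python
import math

def occupancy_probabilities(cells_per_ml: float, capillary_volume_nl: float):
    """Return (mean, P(0), P(1), P(>=2)) cells per capillary under an
    assumed Poisson loading model."""
    mean = cells_per_ml * capillary_volume_nl * 1e-6  # 1 nL = 1e-6 mL
    p0 = math.exp(-mean)
    p1 = mean * p0
    return mean, p0, p1, 1.0 - p0 - p1

def concentration_for_mean(target_mean: float, capillary_volume_nl: float) -> float:
    """Cells per mL needed for `target_mean` cells per capillary on average."""
    return target_mean / (capillary_volume_nl * 1e-6)

# Hypothetical values: 250 nL capillaries loaded from 4,000 cells/mL.
mean, p_empty, p_single, p_multi = occupancy_probabilities(4.0e3, 250.0)
print(f"mean={mean:.2f} empty={p_empty:.3f} single={p_single:.3f} multi={p_multi:.3f}")
```

Lowering the concentration by dilution reduces the fraction of capillaries holding more than one particle, at the cost of more empty capillaries.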

[000283] Alternatively, the particle-bearing liquid may be wicked a portion of the way into the capillaries, 
and then the substrate is wicked into a remaining portion of the capillaries. 

The mixture in the capillaries can then be incubated for producing a desired activity. The incubation can be 
for a specific period of time and at an appropriate temperature necessary for cell growth, for example, or to 
allow the substrate to permeabilize the cell membrane to produce an optically detectable signal, or for a 
period of time and at a temperature for optimum enzymatic activity. The incubation can be performed, for 
example, by placing the capillary array in a humidified incubator or in an apparatus containing a water 
source to ensure reduced evaporation within the capillary tubes. Evaporative loss may be reduced by 
increasing the relative humidity (e. g. , by placing the capillary array in a humidified chamber). 

The evaporation rate can also be reduced by capping the capillaries with an oil, wax, membrane or the like. 
Alternatively, a high molecular weight fluid such as various alcohols, or molecules capable of forming a
molecular monolayer, bilayer or other thin films (e. g., fatty acids), or various oils (e. g. , mineral oil) can 
be used to reduce evaporation. 

[000284] Figure 11 illustrates a method for incubating a substrate solution containing cells of interest.
While only a single capillary (20) is shown in Figure 11 for simplicity, it should be understood that the
incubation method applies to a capillary array having a plurality of capillaries (20). In accordance with one
aspect, a first fluid is wicked into the capillary (20) according to methods described above. The capillary 
(20) containing the substrate solution and cells (32) is then introduced to a fluid bath (70) containing a 
second liquid (72). The second liquid may or may not be the same as the first. For instance, the first liquid 
may contain particles (32) from which an activity is screened. The particles (32) are suspended in liquid 
within the lumen (40), and gradually migrate toward the top of the lumen (40) in the direction of the flow 
of liquid through the capillary (20) due to evaporation. The width of the lumen (40) at the open end of the 
capillary (20) is sized to provide a particular surface area of liquid at the top of the lumen (40), for 
controlling the amount and rate of evaporation of the liquid mixture. By controlling the environment (68) 
near the non-submersed end of the capillary (20), the first liquid from within the capillary (20) will 
evaporate, and will be replenished by the second liquid (72) from the fluid bath (70). 
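
The relationship between lumen width and replenishment flow can be sketched under a simplifying assumption: at steady state the liquid drawn from the fluid bath equals the liquid evaporated, and evaporation scales with the exposed liquid surface area. The evaporative flux value, the lumen diameters, and the function name below are hypothetical and serve only to show the scaling.

```python
import math

def replenishment_flow_nl_per_h(lumen_diameter_um: float,
                                evap_flux_nl_per_mm2_h: float) -> float:
    """Steady-state flow through one capillary, assuming evaporation is
    proportional to the circular liquid surface at the open end."""
    radius_mm = lumen_diameter_um / 1000.0 / 2.0
    area_mm2 = math.pi * radius_mm ** 2
    return evap_flux_nl_per_mm2_h * area_mm2

# Hypothetical flux of 100 nL per mm^2 per hour at the open end.
narrow = replenishment_flow_nl_per_h(200.0, 100.0)
wide = replenishment_flow_nl_per_h(400.0, 100.0)
print(f"200 um lumen: {narrow:.2f} nL/h, 400 um lumen: {wide:.2f} nL/h")
```

In this simplified model, doubling the lumen width quadruples the flow, which is one reason the lumen width and the surrounding humidity would be chosen together.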

[000285] The amount of evaporation is balanced against possible diffusion of the contents of the capillary 
(20) into the liquid (72), and against possible mechanical mixing of the capillary contents with the liquid
(72) due to vibration and pressure changes. The greater the width of the lumen (40), the larger the amount 
of mechanical mixing. Therefore, the temperature and humidity level in the surrounding environment may 
be adjusted to produce the desired evaporative cycle, and the lumen (40) width is sized to minimize 
mechanical mixing, in addition to producing a desired evaporation rate. The non-submersed open end of the
capillary (20) may also be capped to create a vacuum force for holding the capillary contents within the
capillary, and minimizing mechanical mixing and diffusion of the contents within the liquid (72). However,
when capped, the capillary (20) will not experience evaporation. 

[000286] The liquid (72) can be supplemented with nutrients (74) to support a greater likelihood or rate of 
activity of the particles (32). For example, oxygen can be added to the liquid to nourish cells or to optimize 
the incubation environment of the cells. In another example, the liquid (72) can contain a substrate or a 
recombinant clone, or a developer for the particles (32). The cells can be optimally cultured by controlling 
the amount and rate of evaporation. For instance, by decreasing relative humidity of the environment (68), 
evaporation from the lumen (40) is increased, thereby increasing a rate of flow of liquid (72) through the
capillary (20). Another advantage of this method is the ability to control conditions within the capillary 
(20) and the environment (68) that are not otherwise possible. 
[000287] A relatively high humidity level of the environment will slow the rate of evaporation and keep
more liquid within the capillary (20). If a temperature differential exists between a capillary array (10) and
its environment, however, condensation can form on or near the ends of tightly packed capillaries of the
capillary array. Figure 12A shows a portion of a capillary array (10) of the invention, to depict a situation 
in which a condensation bead (80) forms on the outer edge surface of several capillary walls (30), creating 
a potential conduit or bridge for "cross-talk" of matter between adjacent capillary tubes (20). The outer edge
surface of the capillary walls (30) is preferably a planar surface. In an aspect in which the wall (30) of the 
capillary (20) is glass, the outer edge surface of the capillary wall (30) can be polished glass. 

[000288] In order to minimize the effects of such condensation, a hydrophobic coating (35) is provided 
over the outer edge surface of the capillary walls (30), as depicted in Figure 12B.

The coating (35) reduces the tendency for water or other liquid to accumulate near the outer edge surface of 
the capillary wall (30). Condensation will form either as smaller beads (82), be repelled from the surface of 
the capillary array, or form entirely over an opening to the lumen (40). In the latter case, the condensation 
bead (80) can form a cap to the capillary (20). In one aspect, the hydrophobic coating (35) is TEFLON. In 
one configuration, the coating (35) covers only the outer edge surfaces of the capillary walls (30). In 
another configuration, the coating (35) can be formed over both the interstitial material (60) and the outer 
edge surfaces of the capillary walls (30). Another advantage of a hydrophobic coating (35) over the outer
edge surface of the capillary tubes arises during the initial wicking process, when some fluidic material in the
form of droplets tends to stick to the surface at which the fluid is introduced. The coating (35)
minimizes extraneous fluid forming on the surface of a capillary array (10), dispensing with a need to
shake or knock the extraneous fluid from the surface. 

[000289] In some instances, it is necessary to have more than one component in a capillary that are not 
premixed, and which can be later combined by dilution or mixing. Figures 13A-C show a dilution process
that may be used to achieve a particular concentration of particles. 

In one aspect employing dilution, a bolus of a first component (82) is wicked into a capillary (20) by 
capillary action until only a portion of the capillary (20) is filled. In one particular aspect, pressure is 
applied at one end of the capillary (20) to prevent the first component from wicking into the entire capillary 
(20). The end (21) of the capillary may be completely or partially capped to provide the pressure. An 
amount of air (84) is then introduced into the capillary adjacent the first component. The air (84) can be 
introduced by any number of processes. One such process includes moving the first component (82) in one 
direction within the capillary until a suitable amount of the air (84) is introduced behind the first 
component (82). Further movement of the first component (82) by a pulling and/or pushing pressure causes 
a piston-like action by the first component (82) on the air. The capillary (20) or capillary array is then 
contacted to a second component (86). The second component (86) is preferably pulled into the capillary 
(20) by the piston-like action created by movement of the first component (82), until a suitable amount of
the second component (86) is provided in the capillary, separated from the first component by the air (84).
One of the first or second components may contain one or more particles of interest, and the other of the 
components may be a developer of the particles for causing an activity of interest. The capillary or capillary 
array can then be incubated for a period of time to allow the first and second components to reach an 
optimal temperature, or for a sufficient time to allow cell growth for example. The air-bubble separating the 
two components can be disrupted in order to mix the two components together and initiate the
desired activity. Pressure can be applied to collapse the bubble. In one example, the mixture of the first and 
second components starts an enzymatic activity to achieve a multi-component assay. 

[000290] Paramagnetic beads contained within a capillary (20) can be used to disrupt the air bubble and/or 
mix the contents of the capillary (20) or capillary array (10). For example, Figure 14A and 9B depict an
aspect of the invention in which paramagnetic beads are magnetically moved from one location to another
location. The paramagnetic beads are attracted by magnetic fields applied in proximity to the capillary or
capillary array. By alternating or adjusting the location of the magnetic field with respect to each capillary, 
the paramagnetic beads will move within each capillary to mix the liquid therein. Mixing the liquid can 
improve cell growth by increasing aeration of the cells. The method also improves consistency and 
detectability of the liquid sample among the capillaries. 

[000291] In another aspect, a method of forming a multi-component assay includes providing one or more 
capsules of a second component within a first component. The second component capsules can have an 
outer layer of a substance that melts or dissolves at a predetermined temperature, thereby releasing the 
second component into the first component and combining particles among the components. A thermally 
activated enzyme may be used to dissolve the outer layer substance. Alternatively, a "release on
command" mechanism that is configured to release the second component upon a predetermined event or
condition may also be used. 

[000292] In another aspect, recombinant clones containing a reporter construct or a substrate are wicked 
into the capillary tubes of the capillary array. In this aspect, it is not necessary to add a substrate as the 
reporter construct or substrate contained in the clone can be readily detected using techniques known in the 
art. For example, a clone containing a reporter construct such as green fluorescent protein can be detected 
by exposing the clone or substrate within the clone to a wavelength of light that induces fluorescence. Such 
reporter constructs can be implemented to respond to various culture conditions or upon exposure to 
various physical stimuli (including light and heat). In addition, various compounds can be screened in a 
sample using similar techniques. For example, a compound detectably labeled with a fluorescent molecule
can be readily detected within a capillary tube of a capillary array. 

[000293] In yet another aspect, instead of dilution, a fluorescence-activated cell sorter (FACS) is used to 
detect, separate and isolate clones for delivery into the capillary array. In accordance with this aspect, one 
or more clones per capillary tube can be precisely achieved. 

In yet another aspect, cells within a capillary are subjected to a lysis process. A chemical is introduced 
within one of the components to cause a lysis process where the cells burst. 

[000294] Some assays may require an exchange of media within the capillary. In a media exchange 
process, a first liquid containing the particles is wicked into a capillary. The first liquid is removed, and 
replaced with a second liquid while the particles remain suspended within the capillary. Addition of the 
second liquid to the capillary and contact with the particles can initialize an activity, such as an assay, for 
example. The media exchange process may include a mechanism by which the particles in the capillary are 
physically maintained in the capillary while the first liquid is removed. In one aspect, the inner walls of the 
capillary array are coated with antibodies to which cells bind. Then, the first liquid is removed, while the 
cells remain bound to the antibodies, and the second liquid is wicked into the capillary. The second liquid
could be adapted to cause the cells to unbind if desirable. In an alternative aspect, one or more walls of the
capillary can be magnetized. The particles are also magnetized and attracted to the walls. In still another 
aspect, magnetized particles are attracted and held against one side of the capillary upon application of a 
magnetic field near that side.

[000295] The capillary array is analyzed for identification of capillaries having a detectable signal, such as 
an optical signal (e. g. , fluorescence), by a detector capable of detecting a change in light production or 
light transmission, for example. Detection may be performed using an illumination source that provides 
fluorescence excitation to each of the capillaries in the array, and a photodetector that detects resulting 
emission from the fluorescence excitation. Suitable illumination sources include, without limitation, a laser, 
incandescent bulb, light emitting diode (LED), arc discharge, or photomultiplier tube. Suitable 
photodetectors include, without limitation, a photodiode array, a charge-coupled device (CCD), or charge
injection device (CID).

[000296] In one aspect, shown with reference to Figure 15, a detection system includes a laser source (82) 
that produces a laser beam (84). The laser beam (84) is directed into a beam expander (85) configured to 
produce a wider or less divergent beam (86) for exciting the array of capillaries (20). Suitable laser sources 
include argon or ion lasers. For this aspect, a cooled CCD can be used. 

[000297] The light generated by, for example, enzymatic activation of a fluorescent substrate is detected by 
an appropriate light detector or detectors positioned adjacent to the apparatus of the invention. The light 
detector may be, for example, film, a photomultiplier tube, photodiode, avalanche photo diode, CCD or 
other light detector or camera. The light detector may be a single detector to detect sequential emissions, 
such as a scanning laser. Or, the light detector may include a plurality of separate detectors to detect and
spatially resolve simultaneous emissions at single or multiple wavelengths of emitted light. The light 
emitted and detected may be visible light or may be emitted as non-visible radiation such as infrared or
ultraviolet radiation. A thermal detector may be used to detect an infrared emission. The detector or 
detectors may be stationary or movable. 

[000298] Illumination can be channeled to particles of interest within the array by means of lenses, mirrors 
and fiber optic light guides or light conduits (single, multiple, fixed, or moveable) positioned on or adjacent 
to at least one surface of the capillary array. A detectable signal, such as emitted light or other radiation, 
may also be channeled to the detector or detectors by the use of such mechanisms. The photodetector can 
comprise a CCD, CID or an array of photodiode elements. Detection of a position of one or more 
capillaries having an optical signal can then be determined from the optical input from each element.
Alternatively, the array may be scanned by a scanning confocal or phase-contrast fluorescence microscope
or the like, where the array is, for example, carried on a movable stage for movement in a X-Y plane as the 
capillaries in the array are successively aligned with the beam to determine the capillary array positions at 
which an optical signal is detected.

A CCD camera or the like can be used in conjunction with the microscope. The detection system can be
computer-automated for rapid screening and recovery. In one aspect, the system uses a telecentric lens for
detection. The magnification of the lens can be adjusted to focus on a subset of capillaries in the capillary 
array. At one extreme, for instance, the detection system can have a 1: 1 correlation of pixels to capillaries. 
Upon detecting a signal, the focus can be adjusted to determine other properties of the signal. Having more 
pixels per capillary allows for subsequent image processing of the signal. 

[000299] Where a chromogenic substrate is used, the change in the absorbance spectrum can be measured, 
such as by using a spectrophotometer or the like. Such measurements are usually difficult when dealing 
with a low-volume liquid because the optical path length is short. However, the capillary approach of the 
present invention permits small volumes of liquid to have long optical path lengths (e. g. , longitudinally 
along the capillary tube), thereby providing the ability to measure absorbance changes using conventional 
techniques. 
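
The benefit of the longer optical path can be illustrated with the Beer-Lambert relation, in which absorbance is the product of molar absorptivity, concentration, and path length. The sketch below is an assumption-laden illustration: the absorptivity, concentration, lumen width, and capillary length are hypothetical values, not figures from this specification.

```python
def absorbance(molar_absorptivity: float, concentration_molar: float,
               path_length_cm: float) -> float:
    """Beer-Lambert absorbance: A = epsilon * c * l."""
    return molar_absorptivity * concentration_molar * path_length_cm

# Hypothetical chromogenic product: epsilon in L/(mol*cm), concentration in mol/L.
EPSILON = 15000.0
CONCENTRATION = 10e-6

across_lumen = absorbance(EPSILON, CONCENTRATION, 0.02)    # assumed 200 um lumen width
along_capillary = absorbance(EPSILON, CONCENTRATION, 1.0)  # assumed 10 mm capillary length

print(f"across lumen: A = {across_lumen:.4f}")
print(f"along capillary: A = {along_capillary:.4f}")
```

With these assumed dimensions, reading along the capillary yields fifty times the absorbance of reading across the lumen for the same liquid.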

[000300] A fluid within a capillary will usually form a meniscus at each end. Any light entering the 
capillary will be deflected toward the wall, except for paraxial rays, which enter the meniscus curvature at 
its center. The paraxial rays create a small bright spot in the middle of the capillary, representing the small amount
of light that makes it through. Measurement of the bright spot provides an opportunity to measure how 
much light is being absorbed on its way through. In one aspect, a detection system includes the use of two 
different wavelengths. A ratio between a first and a second wavelength indicates how much light is 
absorbed in the capillary. Alternatively, two images of the capillary can be taken, and a difference between
them can be used to ascertain a differential absorbance of a chemical within the capillary. 
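
One way to turn a two-wavelength measurement into an absorbance estimate is sketched below. It assumes that the first wavelength is absorbed by the chemical of interest while the second serves as a non-absorbed reference, and that a blank capillary is available for normalization. These assumptions, the function name, and the sample intensities are hypothetical.

```python
import math

def differential_absorbance(i_signal: float, i_reference: float,
                            blank_signal: float = 1.0,
                            blank_reference: float = 1.0) -> float:
    """Absorbance at the signal wavelength, corrected by a reference
    wavelength and normalized to a blank capillary."""
    ratio = (i_signal / i_reference) / (blank_signal / blank_reference)
    return -math.log10(ratio)

# Hypothetical bright-spot intensities in arbitrary units.
a = differential_absorbance(i_signal=10.0, i_reference=100.0)
print(f"differential absorbance: {a:.2f}")
```

Dividing by the reference wavelength cancels losses common to both wavelengths, such as light deflected by the meniscus, so that the remaining difference is attributed to absorption.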
[000301] In absorbance detection, only light in the center of the lumen can travel through the capillary.
However, if at least one meniscus is flattened, the optical efficiency is improved. 

The meniscus can be kept flat under a number of circumstances, such as during a continuous cycle of 
evaporation, discussed above with reference to Figure 11. In that aspect, the fluid bath can be contained in a
clear, light-passing container, and the light source can be directed through the fluid bath into the capillary. 

[000302] In another aspect, bioactivity or a biomolecule or compound is detected by using various 
electromagnetic detection devices, including, for example, optical, magnetic and thermal detection. In yet 
another aspect, radioactivity can be detected within a capillary tube using detection methods known in the 
art. The radiation can be detected at either end of the capillary tube. Other detection modes include, without 
limitation, luminescence, fluorescence polarization, and time-resolved fluorescence. Luminescence detection
includes detecting emitted light that is produced by a chemical or physiological process associated with a 
sample molecule or cell. Fluorescence polarization detection includes excitation of the contents of the 
lumen with polarized light. Under such conditions, a fluorophore emits polarized light for a particular
molecule. However, the emitting molecule can be moving and changing its angle of orientation, and the 
polarized light emission could become random. 
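
The degree to which emission remains polarized is conventionally summarized by the polarization ratio or the anisotropy, computed from emission intensities measured parallel and perpendicular to the excitation polarization. The sketch below uses those standard definitions; the function names and the intensity values are hypothetical.

```python
def polarization(i_parallel: float, i_perpendicular: float) -> float:
    """Polarization P = (I_par - I_perp) / (I_par + I_perp)."""
    total = i_parallel + i_perpendicular
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return (i_parallel - i_perpendicular) / total

def anisotropy(i_parallel: float, i_perpendicular: float) -> float:
    """Anisotropy r = (I_par - I_perp) / (I_par + 2 * I_perp)."""
    total = i_parallel + 2.0 * i_perpendicular
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return (i_parallel - i_perpendicular) / total

# Hypothetical readings: a slowly tumbling emitter keeps more polarization
# than a freely tumbling one.
slow = polarization(3.0, 1.0)
fast = polarization(1.0, 1.0)
print(f"slow tumbling P = {slow:.2f}, fast tumbling P = {fast:.2f}")
```

A value near zero corresponds to the randomized emission described above, while a higher value indicates that the emitting molecule rotated little during the fluorescence lifetime.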

[000303] Time-resolved fluorescence includes reading the fluorescence at a predetermined time after 
excitation. For a relatively long-life fluorophore, the molecule is flashed with excitation energy, which 
produces emissions from the fluorophore as well as from other particles within the substrate. Emissions 
from the other particles cause background fluorescence. The background fluorescence normally has a short
lifetime relative to the long-life emission from the fluorophore. The emission is read after excitation is
complete, at a time when the short-lived background fluorescence has usually decayed, and during a time in which
the long-life fluorophore continues to fluoresce. Time-resolved fluorescence is therefore a technique for
suppressing background fluorescent activity.
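
The effect of delaying the read can be sketched with a simple model in which both the reporter and the background decay as single exponentials. The model, the lifetimes, the amplitudes, and the function name are assumptions chosen for illustration and do not come from this specification.

```python
import math

def gated_intensities(delay_s: float, probe_amp: float, probe_tau_s: float,
                      bg_amp: float, bg_tau_s: float):
    """Return (probe, background) intensities remaining after `delay_s`,
    assuming single-exponential decay for each."""
    probe = probe_amp * math.exp(-delay_s / probe_tau_s)
    background = bg_amp * math.exp(-delay_s / bg_tau_s)
    return probe, background

# Hypothetical lifetimes: a long-life reporter and a short-lived background
# that starts out 100 times brighter than the reporter.
PROBE_TAU_S = 500e-6
BACKGROUND_TAU_S = 10e-9

probe0, bg0 = gated_intensities(0.0, 1.0, PROBE_TAU_S, 100.0, BACKGROUND_TAU_S)
probe1, bg1 = gated_intensities(100e-9, 1.0, PROBE_TAU_S, 100.0, BACKGROUND_TAU_S)

print(f"signal/background at 0 ns:   {probe0 / bg0:.3f}")
print(f"signal/background at 100 ns: {probe1 / bg1:.1f}")
```

In this model, a delay of ten background lifetimes leaves the reporter emission nearly unchanged while the background falls by several orders of magnitude.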

[000304] Recovery of putative hits (cells or clones producing a detectable or optical signal) can be 
facilitated by using position feedback from the detection system to automate positioning of a recovery 
device (e. g. , a needle pipette tip or capillary tube). Figure 16 shows an example of a recovery system 
(100) of the invention. In this example, a needle (105) is selected and connected to recovery mechanism
(106). A support table (102) supports a capillary array (10) and a light source (104). The light source is
used with a camera assembly (110) to find an X, Y and Z coordinate location of a needle (105) connected
to the recovery mechanism (106). The support table is moved relative to the capillary array in the X and Y 
axes, in order to place the capillary array (10) underneath the needle (105), where the capillary array (10) 
contains a "hit." According to various aspects, each section of a recovery system can be moved or kept
stationary. 

[000305] The recovery mechanism (106) then provides a needle (105) to a capillary containing a "hit" by
overlapping the tip of the needle (105) with the capillary containing the "hit," in the Z direction, until the
tip of the needle engages the capillary opening. In order to avoid damage to the capillary itself, the needle
may be attached to a spring or be of a material that flexes. Once in contact with the opening of the capillary 
the sample can be aspirated or expelled from the capillary. Alternatively, the capillary array may be moved 
relative to a stationary needle (105), or both moved. 

[000306] In a specific exemplary aspect of a recovery technique, a single camera is used for determining a 
location of a recovery tool, such as the tip of a needle, in the Z-plane. The Z-plane determination can be
accomplished using an auto-focus algorithm, or proximity sensor used in conjunction with the camera. 
Once the proximity of the recovery tool in Z is known, an image processing function can be executed to
determine a precise location of the recovery tool in X and Y. In one aspect, the recovery tool is back-lit to 
aid the image processing. 
Once the X and Y coordinate locations are known, the capillary array can be moved in X and Y relative to
the precise location of the recovery tool, which can be moved along the Z axis for coupling with a target 
capillary. 
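
The X and Y move that brings a target capillary under the recovery tool can be sketched as follows. The example assumes a square-grid array with a uniform pitch and a known origin, which is a simplification, since an actual array may be packed differently. The function names, the pitch, and the coordinates are hypothetical.

```python
def capillary_center_mm(row: int, col: int, pitch_mm: float,
                        origin_mm=(0.0, 0.0)):
    """Center of a capillary on an assumed square grid, in stage coordinates."""
    return (origin_mm[0] + col * pitch_mm, origin_mm[1] + row * pitch_mm)

def stage_move_mm(hit_row: int, hit_col: int, tool_xy_mm,
                  pitch_mm: float, origin_mm=(0.0, 0.0)):
    """X and Y displacement of the array that places the hit capillary
    directly under the recovery tool."""
    cx, cy = capillary_center_mm(hit_row, hit_col, pitch_mm, origin_mm)
    return (tool_xy_mm[0] - cx, tool_xy_mm[1] - cy)

# Hypothetical hit at row 2, column 3, with the tool tip located at (10, 10) mm.
dx, dy = stage_move_mm(2, 3, tool_xy_mm=(10.0, 10.0), pitch_mm=0.5)
print(f"move array by dx={dx} mm, dy={dy} mm")
```

After this move, only motion of the recovery tool along the Z axis remains to couple it with the target capillary.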

[000307] In an alternative specific aspect of a recovery technique, two or more cameras are used for
determining a location of the recovery tool. For instance, a first camera can determine X and Z coordinate 
locations of the recovery tool, such as the X, Z location of a needle tip. A second camera can determine Y 
and Z coordinate locations of the recovery tool. The two sets of coordinates can then be multiplexed for a 
complete X, Y, Z coordinate location. Next, the movement of the capillary array relative to the recovery 
tool can be executed substantially as above. 
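
By way of illustration only, the two-camera approach can be sketched as a small routine that merges an (X, Z) reading and a (Y, Z) reading into one (X, Y, Z) location. The function names, the Z tolerance, and the choice to average the two Z readings are assumptions made for this sketch and are not specified in the description above.

```python
from typing import Tuple


def combine_views(xz: Tuple[float, float], yz: Tuple[float, float],
                  z_tolerance: float = 0.05) -> Tuple[float, float, float]:
    """Merge an (X, Z) reading and a (Y, Z) reading into one (X, Y, Z) point.

    Both cameras observe Z, so the two Z values are cross-checked and then
    averaged (an assumption of this sketch, not a stated requirement).
    """
    x, z_first = xz
    y, z_second = yz
    if abs(z_first - z_second) > z_tolerance:
        raise ValueError("cameras disagree on Z beyond tolerance")
    return (x, y, (z_first + z_second) / 2.0)


def stage_offset(tool_xy: Tuple[float, float],
                 target_xy: Tuple[float, float]) -> Tuple[float, float]:
    """X, Y move of the capillary array that puts the target under the tool."""
    return (tool_xy[0] - target_xy[0], tool_xy[1] - target_xy[1])


# Hypothetical readings in millimetres.
tip = combine_views((12.0, 3.00), (40.5, 3.02))
move = stage_offset(tip[:2], (10.0, 42.0))
```

Once the move is applied in X and Y, only the Z axis of the recovery tool needs to be driven to couple with the target capillary.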

[000308] The sample can be expelled by, for example, injecting a blast of inert gas or fluid into the 
capillary and collecting the ejected sample in a collection device at the opposite end of the capillary. The 
diameter of the collection device can be larger than or equal to the diameter of the capillary. The collected 
sample can then be further processed by, for example, extracting polynucleotides, proteins or by growing 
the clone in culture. 

[000309] In another aspect, the sample is aspirated by use of a vacuum. In this aspect, the needle contacts, 
or nearly contacts, the capillary opening and the sample is "vacuumed" or aspirated from the capillary tube 
onto or into a collection device. The collection device may be a microfuge tube or a filter located proximal 
to the opening of the needle, as depicted in Figure 17A-D. Figure 17D shows further processing of a 
sample collected onto a filter following aspiration of the sample from the capillary. The sample includes 
particles, such as cells, proteins, or nucleic acids, which when present on the filter, can be delivered into a 
collection device. Suitable collection devices include a microfuge tube, a capillary tube, microtiter plate, 
cell culture plate, and the like. The delivery of the sample can be accomplished by forcing another media, 
air or other fluid through the filter in the reverse direction. 

[000310] The sample can also be expelled from a capillary by a sample ejector. In one aspect, the ejector is 
a jet system where sample fluid at one end of the capillary tube is subjected to a high temperature, causing 
fluid at the other end of the capillary tube to eject out. The heating of fluid can be accomplished 
mechanically, by applying a heated probe directly into one end of a capillary tube. The heated probe 
preferably seals the one end, heats fluid in contact with the probe, and expels fluid out the other end of the 
capillary tube. The heating and expulsion may also be accomplished electronically. For instance, in an 
aspect of the jet system, at least one wall of a capillary tube is metalized. A heating element is placed in 
direct contact with one end of the wall. The heating element may completely close off the one end, or 
partially close the one end. The heating element charges up the metalized wall, which generates heat within 
the fluid. The heating element can be an electricity source, such as a voltage source, or a current source. In 
still yet another aspect of a jet system, a laser applies heat pulses to the fluid at one end of the capillary 
tube. 

[000311] Other systems for expelling fluid from a capillary tube of the invention are possible. An electric 
field may be created in or near the fluid to create an electrophoretic reaction, which causes the fluid to 
move according to electromotive force created by the electric field. An electromagnetic field may also be 
used. In one aspect, one or more capillaries contain, in addition to the fluid, magnetically charged particles 
to help move the fluid or magnetized particles out of the capillary array. Each capillary of an array of 
capillaries is individually addressable, i.e., the contents of each well can be ascertained during screening. In 
one aspect, a quantum-dot-tagged microbead method and arrangement is used. 

In such a method and arrangement, tens of thousands of unique fluorescent codes can be generated. The 
assay of interest is attached to a coded bead, and multi-spectral imaging is used to measure both the assay 
and the beads/codes simultaneously. There will always be some capillaries that get multiple beads and 
some that get none. 

[000312] For an array which contains approximately 100,000 capillaries, one approach is to fill the 100,000 
capillaries of the array with a solution that contains 10 copies of 10,000 different coded beads (or 5 copies 
of 20,000 codes). Under normal conditions, simple statistical analysis can be used to determine which of 
the wells have single beads and maybe even the contents of every well. The chance of having any two 
beads together in a well more than five times on any one capillary array platform is negligibly small. 
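
The "simple statistical analysis" mentioned above can be illustrated with a Poisson occupancy estimate. This sketch assumes the beads distribute independently and uniformly across capillaries, which the description does not state; the function names are likewise illustrative.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k beads in a well when the mean is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def occupancy_summary(n_wells: int, n_beads: int) -> dict:
    """Expected numbers of empty, single-bead and multi-bead wells."""
    mean = n_beads / n_wells
    p0 = poisson_pmf(0, mean)
    p1 = poisson_pmf(1, mean)
    return {
        "mean_beads_per_well": mean,
        "expected_empty": n_wells * p0,
        "expected_single": n_wells * p1,
        "expected_multiple": n_wells * (1.0 - p0 - p1),
    }


# 10 copies of 10,000 codes loaded into about 100,000 capillaries.
summary = occupancy_summary(100_000, 10 * 10_000)
for key, value in summary.items():
    print(f"{key}: {value:,.1f}")
```

Under these assumptions the mean is one bead per well, so roughly 37% of wells would hold a single bead, a similar share would be empty, and the rest would hold two or more.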

[000313] An advantage of the quantum-dots method is that only a single excitation band is needed. This 
allows a lot of flexibility for the assay (i.e., it can use a different excitation band). Magnetic-coded beads 
may also be used to add another dimension to the assay detection. A multi-spectral imaging system can 
then be used. Alternatively, a neural network application can be utilized for spectral decomposition. 

[000314] The myriad of microbes inhabiting this planet represent a tremendous repository of biomolecules 
for pharmaceutical, agricultural, industrial and chemical applications. The great majority of these microbes, 
estimated at near 99.5%, have remained uncultured by modern microbiological methods due in large part to 
the complex chemistries and environmental variables encountered in extreme or unusual biotopes. Taking 
advantage of enzymes catalyzing chemical reactions in novel pathways and evolved to function under 
environmental extremes is of great industrial significance. This invention provides systems and methods to 
extract, optimize and commercialize this robust catalytic diversity, within culture-independent, recombinant 
approaches for the discovery of novel enzymes and biosynthetic pathways by tapping into the biodiversity 
present in nature. In one aspect, large, complex (>10^9 member) gene libraries are constructed by direct 
isolation of DNA from selected microenvironments around the world. These libraries are then expressed in 
various host systems and subjected to high throughput screens specific for an activity of interest. 

Because in excess of 5000 different microbial genomes may be present in a single DNA library, ultra high 
throughput methods are required to effectively screen this diversity and are crucial to the success of this 
culture-independent, recombinant strategy. 

[000315] The invention provides screening platforms and methods for use with a Fluorescence Activated 
Cell Sorter (FACS). In FACS methodologies, cells are mixed with substrates and then streamed past a 
detector to screen for a positive molecular event. This signal could be a fluorescent signal resulting from 
the cleavage of an enzyme substrate or a specific binding event. The greatest advantage of the use of a 
FACS machine is throughput; up to 10^9 clones can be screened per day. Unfortunately, FACS based 
screening also has limitations including cell wall permeability of enzymes and substrates/products and 
incubation times and temperatures. In addition, viability of host cells post-sort and dependence on a single 
data point for each individual cell further limit such technologies. 

[000316] The development of the capillary array overcomes many of these shortcomings. Like microtiter 
and solid phase screens, it combines the preservation of native protein conformation with increased signal 
strength of clonal amplification. The throughput, however, approaches that of selective assays and FACS- 
based assays. Moreover, as array plates are reusable, the amount of plastic waste generated is greatly 
reduced. Approximately 24 tons of plastic waste* is generated annually in screening 100,000 wells per day 
in a 96 well format (* Assuming 84g/plate x 1000 plates/day x 260 days/year). Further, a typical screen of 
100,000 wells on a robotic high throughput screening system requires 261 384-well microtiter plates and 
over 24 hours of equipment time versus less than 10 minutes to process a single plate. The enhancement of 
this technology to densities of one million wells per plate is aimed at approaching the throughput of 
selective assays and FACS-based assays while retaining the advantages of a microtiter-based screen. 
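
The plastic waste and plate count figures quoted above can be checked with a few lines of arithmetic. The conversion to US short tons is an assumption of this sketch; the description says only "tons."

```python
import math

GRAMS_PER_PLATE = 84        # from the footnote
PLATES_PER_DAY = 1000       # footnote's rounding of 100,000 wells / 96 wells
DAYS_PER_YEAR = 260         # from the footnote
GRAMS_PER_SHORT_TON = 907_184.74   # assumption: "tons" means US short tons

annual_grams = GRAMS_PER_PLATE * PLATES_PER_DAY * DAYS_PER_YEAR
annual_short_tons = annual_grams / GRAMS_PER_SHORT_TON

# Number of 384-well plates needed to hold 100,000 wells.
plates_384 = math.ceil(100_000 / 384)

print(f"{annual_grams / 1e6:.2f} metric tons, {annual_short_tons:.1f} short tons")
print(f"{plates_384} plates of 384 wells")
```

The footnote's inputs give about 21.8 metric tons, or about 24 short tons, per year, and 100,000 wells do require 261 plates in 384-well format.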

[000317] The first generation capillary array plates, which can be fabricated using manufacturing techniques 
originally developed for the fiber optics industry, currently consist of 100,000 cylindrical compartments or 
wells contained within a 3.3" x 5" reusable plate, the size of an SBS (Society for Biomolecular Screening) 
standard 96 well microtiter plate. These wells are 200 µm in diameter (about the diameter of a human hair) 
and act as discrete 250 nanoliter volume microenvironments in which isolated clones can be grown and 
screened. 

[000318] The processes involved in array screening closely parallel those in microtiter plate screening, but 
with significant simplification in required instrumentation and decrease in plate storage capacity 
requirements and reagent costs. Briefly, the plates are filled with clones and reagents (e.g., fluorescent 
substrate, growth media, etc.) by surface tension, filling all 100,000 wells simultaneously within a few 
seconds without the need for complicated dispensing equipment. The number of clones per well, typically 1 
to 10, is adjusted by dilution of the cell culture. Once filled, the plates are then incubated in a humidity- 
controlled environment for 24 to 48 hours to allow for both clonal amplification and enzymatic turnover. 
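
As a rough illustration of adjusting the number of clones per well by dilution, the sketch below computes the cell density that yields a chosen average per well. It uses the 250 nanoliter well volume given for the 100,000-well plate; the stock density in the example is hypothetical.

```python
NL_PER_ML = 1_000_000


def required_density_per_ml(clones_per_well: float, well_volume_nl: float) -> float:
    """Clones per mL that give the target mean number of clones per well."""
    return clones_per_well * NL_PER_ML / well_volume_nl


def dilution_factor(stock_density_per_ml: float, clones_per_well: float,
                    well_volume_nl: float) -> float:
    """Fold dilution of the stock culture needed to reach the target density."""
    return stock_density_per_ml / required_density_per_ml(clones_per_well, well_volume_nl)


low = required_density_per_ml(1, 250)     # 1 clone per 250 nL well
high = required_density_per_ml(10, 250)   # 10 clones per 250 nL well
fold = dilution_factor(1e9, 1, 250)       # hypothetical 1e9 clones/mL stock
```

For 250 nanoliter wells, 1 to 10 clones per well corresponds to about 4,000 to 40,000 clones per milliliter of loading suspension.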

[000319] After incubation in a humidified chamber, the plates are transferred to the detection and recovery 
station where fluorescence imaging is used to detect the expression of bioactive molecules. The automated 
detection and recovery system combines fluorescence imaging and precision motion control technologies 
through the use of machine vision and image processing techniques. Images are generated by focusing light 
from a broadband light source (e.g., metal halide arc lamp) onto the plate through a set of fluorescence 
excitation filters. The resulting fluorescence emission is filtered then imaged by a telecentric lens onto a 
high-resolution cooled CCD camera in an epi-fluorescent configuration. The plates are scanned to generate 
a total of 56 slightly overlapping images in approximately one minute. 

The images are digitized and processed on-the-fly to detect and locate positive wells or putative hits. 
Putative hits (clones that have converted the substrate to a fluorescent product) appear as bright spots on a 
dark background. They are distinguished from background fluorescence and extraneous signals (typically 
due to dirt and dust) based on a variety of feature measurements such as their shape, size, and intensity 
profile. 
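
A minimal sketch of this kind of hit detection is given below: pixels above a threshold are grouped into connected blobs, and blobs whose area falls outside an expected range (for example, single-pixel dust specks) are rejected. The threshold, the area limits and the use of 4-connectivity are assumptions chosen for illustration; the actual feature measurements used by the system are not detailed in the description.

```python
from collections import deque


def find_hits(image, threshold, min_area=2, max_area=50):
    """Return (row, col) centroids of bright blobs whose pixel area is in range."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    hits = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Breadth-first flood fill over 4-connected bright pixels.
            queue = deque([(r, c)])
            seen[r][c] = True
            blob = []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Size filter: keep blobs that look like a well, drop specks.
            if min_area <= len(blob) <= max_area:
                cy = sum(p[0] for p in blob) / len(blob)
                cx = sum(p[1] for p in blob) / len(blob)
                hits.append((cy, cx))
    return hits


# Toy image: one 2x2 bright well and one isolated bright pixel (a "speck").
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 9, 9, 0, 0, 8],
    [0, 0, 0, 0, 0, 0],
]
hits = find_hits(image, threshold=5)
```

In this toy case only the 2x2 blob is reported, with its centroid at row 1.5, column 1.5. A production system would likely add shape and intensity-profile measurements on top of the size filter.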

[000320] Once detected and located, putative hits are recovered from the array plate and transferred to a 
standard microtiter plate for confirmation and secondary screening. The process of recovery consists of 1) 
mounting and locating a sterile recovery needle (typically a standard blunt end stainless steel needle 
commonly used for dispensing adhesives for mounting miniature surface mount electronic components), 2) 
aligning the recovery needle to the well containing the putative hit, 3) aspirating the contents of the well 
into the needle (which has an attached 0.22 micron filter to avoid upstream contamination and losing the 
sample), 4) flushing the well contents into a standard microtiter plate with an appropriate media, and finally 
5) stripping off the recovery needle in preparation for the next recovery. 

Closed loop positioning with image-based feedback provides the positional accuracy required to allow 
aspiration of individual wells without contamination from neighboring wells. 

Finally, after the clones of interest have been recovered, the used plates are cleaned, sterilized, and 
prepared for re-use. The array platform according to the invention will accelerate the discovery and 
development of commercial products as well as enable the development of products that would otherwise 
be unobtainable. 

[000321] This invention can be configured for use with a Fluorescence Activated Cell Sorter (FACS). In 
FACS methodologies, cells are mixed with substrates and then streamed past a detector to screen for a 
positive molecular event. This signal could be a fluorescent signal resulting from the cleavage of an 
enzyme substrate or a specific binding event. The greatest advantage of the use of a FACS machine is 
throughput; up to 10^9 clones can be screened per day. Unfortunately, FACS based screening also has 
limitations including cell wall permeability of enzymes and substrates/products and incubation times and 
temperatures. In addition, viability of host cells post-sort and dependence on a single data point for each 
individual cell further limit such technologies. 

[000322] The well diameter, plate thickness (well depth), and material optical properties will be specified 
prior to fabricating the new 1,000,000-well density matrices. Once these parameters are specified, high 
density matrices will be fabricated in rectangular pieces approximately 1 cm square. The process entails a 
low-risk modification to the same basic fabrication technique that is used to make the 100,000-well plates. 
The array density can be calculated by using the following formula:

$$\text{WellsPerPlate} = \frac{2\,(\text{PlateLength} \times \text{PlateWidth})}{\sqrt{3}\,(\text{WellDiameter} + \text{WellSeparationWall})^{2}}$$

[000323] This calculation reveals that in order to achieve 1,000,000 wells in the 
standard 3.3" x 5" microtiter plate format, the new wells will need to have a diameter of approximately 70 
µm with 25 µm separating walls. Structures of this size/density and smaller (down to 6 µm) are commonly 
manufactured for non-biological uses including micro-channel faceplates for intensified CCD cameras, X- 
ray scintillation plates, optical collimators, as well as simple fluid filters. 
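
The array density formula can be evaluated numerically. The sketch below reads it as the count for hexagonally packed wells, 2 x area / (sqrt(3) x pitch^2), with pitch equal to well diameter plus wall thickness; that reading, and the assumption that the entire 3.3" x 5" footprint is filled with wells, are this sketch's and not the description's.

```python
import math

INCH_MM = 25.4


def wells_per_plate(plate_length_mm: float, plate_width_mm: float,
                    well_diameter_um: float, wall_um: float) -> float:
    """Hexagonal-packing well count for a rectangular plate."""
    pitch_mm = (well_diameter_um + wall_um) / 1000.0
    return 2 * plate_length_mm * plate_width_mm / (math.sqrt(3) * pitch_mm ** 2)


def wells_per_cm2(well_diameter_um: float, wall_um: float) -> float:
    """Hexagonal-packing well count per square centimetre."""
    pitch_cm = (well_diameter_um + wall_um) / 10_000.0
    return 2 / (math.sqrt(3) * pitch_cm ** 2)


n = wells_per_plate(5 * INCH_MM, 3.3 * INCH_MM, 70, 25)
density = wells_per_cm2(70, 25)
print(f"{n:,.0f} wells per plate, {density:,.0f} wells per cm^2")
```

With 70 µm wells and 25 µm walls this gives roughly 1.36 million wells over the full footprint and about 12,800 wells per square centimeter, which presumably leaves margin for borders and handling areas while still exceeding 1,000,000 wells.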

[000324] There are some limitations to the depth of the wells due to the nature of the fabrication process. 
The current 100,000-well plates have 8 mm deep wells. Based on our experience with structures of similar 
size, it is estimated that the depth of the 70 µm wells will be between 5 mm and 8 mm. This yields a well 
volume of approximately 25 nl to 30 nl or approximately 1/10th of that of the 200 µm diameter wells. 
Evaporation rate is a function of the surface area to volume ratio rather than the total volume. For this 
reason it is anticipated that the 70 µm wells will experience evaporation comparable to (if not less than) 
that of the 200 µm wells due to a more favorable length to diameter (volume to surface area) ratio. Evaporation is 
currently not a problem with the 200 µm diameter wells. 
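
The well volumes and length to diameter ratios discussed above follow from treating each well as a cylinder. The sketch below is a simple geometric check under that assumption; it does not model evaporation itself.

```python
import math


def well_volume_nl(diameter_um: float, depth_mm: float) -> float:
    """Volume of a cylindrical well in nanoliters (1 mm^3 = 1000 nL)."""
    radius_mm = diameter_um / 2000.0
    return math.pi * radius_mm ** 2 * depth_mm * 1000.0


def length_to_diameter(diameter_um: float, depth_mm: float) -> float:
    """Aspect ratio of the well (depth divided by diameter)."""
    return depth_mm * 1000.0 / diameter_um


for diameter_um, depth_mm in ((200, 8), (70, 5), (70, 8)):
    volume = well_volume_nl(diameter_um, depth_mm)
    aspect = length_to_diameter(diameter_um, depth_mm)
    print(f"{diameter_um} um x {depth_mm} mm: {volume:.1f} nL, L/D = {aspect:.0f}")
```

A 200 µm by 8 mm well holds about 251 nl, in line with the 250 nl figure, while a 70 µm well holds about 19 nl at 5 mm depth and about 31 nl at 8 mm depth, so the quoted 25 nl to 30 nl range corresponds to the deeper end. The aspect ratio rises from 40 to between about 71 and 114.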

[000325] Samples can be constructed from both transparent and opaque materials to evaluate illumination 
efficiencies, well-to-well optical cross-talk, surface-finish effects, and background fluorescence. The 
current 100,000-well plates use an opaque material. The use of transparent materials improves the 
efficiency of fluorescence excitation at the expense of increased well-to-well optical cross-talk. For assays 
with low hit rates, the tradeoff may favor the use of transparent materials to improve detection sensitivity. 
We estimate that the specification and manufacturing process will take two months. A special holder will 
also be fabricated to adapt the matrices to the capillary array hardware. Once the specified matrices are 
manufactured, they will be tested for each of the optical and mechanical properties detailed below: 

[000326] Background Fluorescence-It is helpful from an imaging and processing perspective, but not 
critical, that the matrix have low background fluorescence for a broad range of excitation wavelengths to 
allow use with a variety of substrates. The materials used in the 200 µm plates were tested and selected to 
satisfy this requirement. In the unlikely event that different materials must be used to fabricate both 
transparent and opaque 70 µm matrices, they will be tested for their fluorescent properties prior to 
fabrication. These tests are performed by measuring and comparing the fluorescence of the material to a 
reference standard at a range of excitation wavelengths. 

[000327] Optical Efficiency-The 100,000-well 
plates are currently illuminated by a roughly collimated beam directly on the face of the plate. Light enters 
each well through the aperture formed by the wall around the well. Transparent materials are expected to offer 
illumination advantages over opaque materials with the current illumination system by transmitting 
additional excitation energy through the walls separating the wells. The optical efficiency of the 
1,000,000-well density matrices will be evaluated by determining the detectable concentration of a fluorescein 
solution. Typically, liquid phase enzyme discovery assays use 10-100 µM concentrations of fluorescent 
substrate. The current detection system can detect approximately 10 nM of fluorescein in the 200 µm wells. 
The equivalent fluorescence of LB (our typical cell growth media) is approximately 25 nM. Hardware 
modifications described in Goal 3 may be required in the unlikely event that the detectable levels are less 
than 10 µM for the new matrices. 




[000328] Optical Cross-talk- While the use of transparent materials may improve the efficiency of 
fluorescence excitation as described above, it does so at the expense of increased well-to-well optical cross- 
talk. This optical cross-talk is due to fluorescence emission that leaks from one well into its neighbors. This 
is easily quantified by spotting a fluorophore onto the matrix, and then measuring the signal intensity vs. 
distance from a fluorophore filled well. The cross-talk could potentially mask the signal of a weak positive 
well resulting in a false negative or be detected as a false positive. In applications where the expected hit 
rate is low (which is commonly the case with enzyme discovery from environmental libraries) the 
probability of this occurring is generally insignificant. However, cross-talk can complicate the image 
processing required to automatically locate putative hits and therefore must be evaluated. 

[000329] Surface Tension/Wicking Properties-The plates are filled by placing the surface of the plate in 
contact with the assay solution. Surface tension at the liquid/plate interface causes the assay components to 
be drawn or wicked into all of the wells simultaneously. The surface preparation of the plate can have 
significant effects on the wicking properties of the matrix. Some surface polishing techniques have been 
found to make the glass face of the plate hydrophobic, thus preventing or significantly slowing the filling of 
the plate. Initially, the same surface finish currently used on the 100,000-well plate will be tested. If 
necessary, matrices with different surface preparations will be placed into contact with a cell/media 
mixture and their wicking properties quantified by timing the filling process and weighing the matrices 
before and after filling. In the event that plate filling remains inadequate after testing available surface 
preparations and treatments, surfactants can be added to improve filling. 

[000330] Resistance to Cleaning and Sterilization-It may be desirable for the 1,000,000-well plates to be 
reusable. To validate this requirement, the matrices will be processed through multiple, rigorous cleaning 
and sterilization protocols. Currently, there is a great deal of latitude in both the cleaning and sterilization 
protocols. Cleaning can consist of a combination of flushing, soaking, and/or sonication in water, solvents 
and/or soaps. Likewise, due to the inherent ruggedness of the materials used, sterilization can be 
accomplished by autoclaving, bleach, ethanol, and/or acid washing. Cleanliness is verified by fluorescence 
imaging of the material at multiple excitation wavelengths. Sterilization is verified by overnight incubation 
of matrices filled with sterile growth media, followed by plating the contents onto agar and looking for 
colony formation. 

[000331] Only minimal modifications to the detection system hardware will be required for the 
1,000,000-well density matrices. Due to the reduced size of the wells, minor modifications to the optical system may need 
to be made to adjust the magnification to an appropriate level to determine screening feasibility. The optical 
system will likely need further modification as proposed in Phase II to enable automated hit recovery. A 
commercially available 2x extender can be added to the existing telecentric imaging lens used for the 
current 100,000-well plate. This modification will render the final image size of each well (relative to the 
camera) approximately 70% of the current size. Based on our experience, this should be more than 
adequate to visualize positive wells for determining feasibility. 

[000332] As mentioned above, the detection sensitivity of the new matrices is expected to be lower 
(especially for opaque matrices) than for the current plates using the current detection system hardware. In 
addition to the use of transparent matrices, a number of hardware enhancements could significantly 
improve sensitivity, including a higher sensitivity cooled CCD camera; laser based illumination or another 
higher power density light source; and faster (possibly non-telecentric) imaging optics. 

[000333] In order to fully take advantage of the throughput afforded by 1,000,000-well plates, a large 
number of unique clones must be generated. Two alternative methods for preparing large numbers (10^7 to 
10^9) of clones per day for screening can be used with the 100,000-well plates. They will both be tested for 
use with the 1,000,000-well density matrices and are described below. One effort will use Resorufin 
β-D-galactopyranoside (Molecular Probes #R-1159) as the fluorescent substrate and a positive 
β-galactosidase control clone (535-GL2) for both assay development and feasibility screening. This substrate 
and positive clone were well characterized and validated during the development of the 100,000-well 
platform. 

[000334] Method 1: Screening Lambda Phage Libraries for Enzymatic Activity-Gene libraries cloned into 
lambda-based vectors are first titered by plating dilutions on soft agar in the presence of an appropriate E. 
coli host strain according to standard techniques. Using this titer information, an adequate amount of the 
lambda library is allowed to adsorb to the host. 

After 15 minutes, a mixture of growth medium and fluorescent substrate is then added to produce a final 
suspension having the following characteristics: [1] a density of host cells that will allow both sufficient 
growth and an effective multiplicity of infection, [2] an optimal concentration of fluorescent substrate for 
detection of the enzymatic activity, and [3] a density of phage particles such that, when loaded into a 
1,000,000-well density matrix, each well will contain an average of 1-4 library clones. (Densities of 5-10 clones 
per well will be attempted once the initial details are worked out.) A sample of this suspension is plated on 
soft agar to determine the average seed density of library clones (concomitant titer). The remainder of the 
suspension is used to load the wells of the matrices. The plates are incubated at 37°C for 16-24 hours 
(protected from light and evaporative loss; see note on Incubation below) to allow lytic multiplication of 
bacteriophage in the wells prior to detection and recovery. 

[000335] Method 2: Screening Phagemid and Other Colony-Based Libraries for Enzymatic Activity- 
Phagemid libraries are produced from parental bacteriophage libraries using an in vivo excision process 
(Short et al., 1988). Following initial titering, these libraries are used to infect an appropriate E. coli host 
strain. After the 15-minute adsorption period, cells are supplied with a small amount of medium and 
allowed to grow at 30 degrees Celsius without antibiotic selection for 45 minutes to allow expression of the 
antibiotic resistance gene present on the phagemid. The suspension is then plated onto solid plates 
containing antibiotic and allowed to grow at 30 degrees Celsius overnight. Amplified clones from the 
resulting antibiotic-resistant colonies are collected into a pooled suspension. A mixture of antibiotic, 
fluorescent substrate and growth medium is then added to produce the final suspension used to load the 
high-density matrices (with characteristics analogous to [2] and [3] above). A sample of this suspension is 
also plated onto solid agar plates containing antibiotic to determine the average seed density of library 
clones (concomitant titer). The matrices are then incubated at 30-37 degrees C for 1-2 days (protected from 
light and evaporative loss; see note on Incubation below) to allow phagemid-containing host cells to 
multiply within the wells prior to detection and recovery. 

[000336] Libraries created in other vectors (e.g., cosmid, fosmid, PAC, YAC, BAC, etc.) are also screened 
using this platform. Factors such as growth requirements, transformation modality, and transformation 
efficiency have to be taken into consideration when adapting a particular library vector to this technology. 
The use of a variety of library and vector types permits screening for small molecules and protein 
therapeutics in addition to novel enzymes. 

[000337] The array plates are typically incubated in a humidified incubator at 90% relative humidity for 24 
to 48 hours. The plates are stackable and designed such that each plate is contained within a humidity and 
temperature stable environment by the plates above and below it. Lids or extra plates filled with water are 
used at the top and bottom of each stack to seal the end plates. The incubation process requires validation 
of cell growth, evaporation, and condensation. 

[000338] The growth of E. coli, which will be used as the enzyme screening host, has been clearly 
demonstrated in the 100,000-well array plate. Other types of cells including Streptomyces, mammalian 
(Jurkat human leukemic T cells), and lambda phage have also been shown to grow in this format. Cell 
growth in the 1,000,000-well density matrices will be verified by the same procedure used for the 
100,000-well plates. The number of colonies formed by plating the initial cell solution (diluted to 1 to 10 
clones/well) will be compared to a culture of equal volume aspirated from the matrix after incubation. 
Although difficulties in cell growth are not anticipated, there are alternative strategies to mitigate these 
difficulties. The surface area to volume ratio of the 1,000,000-well density matrices is less favorable for 
oxygen diffusion into the assay solution than in the 100,000-well format. If oxygen diffusion appears to be 
limiting cell growth, we will evaluate methods for increasing oxygenation. Preliminary experiments have 
successfully demonstrated fluidic mixing in 200 µm diameter wells using paramagnetic beads in a 
fluctuating magnetic field and by agitation with sound pulses. Magnetic mixing has been shown to vastly 
improve the growth of Streptomyces in the 100,000-well format. 

[000339] If necessary, these mixing methods could be employed to improve oxygen diffusion and cell 
growth. Other methods include oxygen saturation of the assay solution prior to plate filling, incubation in a 
high oxygen environment, and the addition of time-released oxygen generating compounds such as sodium 
percarbonate. With a total assay volume of approximately 30 nl, controlling evaporation from the 
1,000,000-well plates will be critical. However, as mentioned above, the surface to volume ratio is favorable for 
minimizing evaporation. Evaporation studies conducted in 100,000-well plates indicate a 10% loss of 
media volume over 24 hours. This loss is reduced to 5% with the addition of 10% glycerol. Because the 
surface area to volume ratio of the 1,000,000-well plates will be similar (if not more favorable) to that of the 
100,000-well plates, comparable losses are expected. Evaporation in the higher density matrices will be measured by filling the plates with 
typical assay media and weighing them at several time points over a 96-hour period. If stricter evaporation 
control is required, glycerol can be added. 
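
The weighing-based evaporation measurement can be reduced to a percent-loss calculation at each time point. The sketch below is one possible way to tabulate it; the masses in the example are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def percent_loss(empty_g: float, filled_g: float, later_g: float) -> float:
    """Percent of the loaded liquid mass lost between filling and a later weighing."""
    loaded = filled_g - empty_g
    if loaded <= 0:
        raise ValueError("filled mass must exceed empty mass")
    return 100.0 * (filled_g - later_g) / loaded


def loss_curve(empty_g: float, weighings):
    """weighings: list of (hours, grams); the first entry is taken right after filling."""
    filled_g = weighings[0][1]
    return [(hours, percent_loss(empty_g, filled_g, grams))
            for hours, grams in weighings]


# Hypothetical plate: 100 g empty, 125 g filled, weighed again at 24 h and 96 h.
curve = loss_curve(100.0, [(0, 125.0), (24, 122.5), (96, 120.0)])
```

Here the curve reads 0% at filling, 10% at 24 hours and 20% at 96 hours. Because mass loss is used as a proxy for volume loss, the result assumes the medium's density stays roughly constant as it evaporates.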

[000340] The effects of condensation/moisture on the surface of the matrices are also considered. Because 
they are incubated in high-humidity environments, droplets on the outer surfaces of the matrices that 
remain after filling or condense during incubation may not evaporate and can cause well to well cross- 
contamination. These droplets can lead to the detection of false positives in wells neighboring a true 
positive as well as cause a blotchy appearance on the plate surface that obscures weak positives. Such 
problems with surface droplets remaining after filling the 100,000-well plates are avoided by letting them 
sit at room temperature until all of the surface moisture has evaporated. Avoiding condensation during 
incubation is accomplished by using strict temperature and humidity control. This issue is addressed by 
placing the filled plates in a programmable humidified chamber that starts with low humidity and increases 
it to the desired incubation humidity only after the plates have warmed to the chamber temperature. Once 
warm, the stacked plates form a relatively stable thermal mass immune to the small temperature 
fluctuations in the chamber. 

Surface moisture control issues will be similar in the higher density plates. The matrices will be tested to 
see if these methods successfully control surface moisture. 

[000341] Negative libraries spiked with the positive β-gal clone at a defined frequency will be the first 
subjects of a feasibility screen. The same screen will be performed in parallel in a conventional microtiter 
format for comparison. Once this is proven, screening will proceed (again in parallel with microtiter 
format) to libraries known to contain positive clones. A mixed population library was validated for this 
purpose during the development of the 100,000-well platform and will be used for the 1,000,000-well 
feasibility screening. 

These experiments will be performed for both lambda-based and phagemid-based library screens since 
clonal amplification rates, and thus signal intensities, may differ between bacteriophage and whole cell 
assays. 

[000342] Validation of the feasibility screens can be performed by simply comparing the number of 
positive wells in the fluorescence images of the 1,000,000-well matrices to those in a 100,000-well array 
plate filled with the identical assay solution. 

[000343] Further verification will be done in standard microtiter format. The number of positive wells is a 
function of the concentration of positive clones in the initial assay solution and the volume of the wells. 
Since the well volume of the 1,000,000-well matrices is approximately 1/10th that of the 100,000-well 
plates, the expected number of positive wells should also be about 1/10th when loading the same initial 
assay solution. 
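
The scaling of positive wells with well volume can be illustrated as follows. The sketch assumes positive clones are distributed among wells according to Poisson statistics and compares the fraction of positive wells for an equal number of wells; both are assumptions of the sketch, and the positive-clone concentration used is hypothetical.

```python
import math

NL_PER_ML = 1_000_000


def positive_fraction(positives_per_ml: float, well_volume_nl: float) -> float:
    """Fraction of wells expected to contain at least one positive clone."""
    mean_per_well = positives_per_ml * well_volume_nl / NL_PER_ML
    return 1.0 - math.exp(-mean_per_well)


small = positive_fraction(40, 25)     # about 25 nL wells (1,000,000-well format)
large = positive_fraction(40, 250)    # 250 nL wells (100,000-well format)
ratio = small / large
print(f"{small:.5f} vs {large:.5f}, ratio {ratio:.3f}")
```

At low hit rates the ratio is very close to the volume ratio of 1/10. The total count of positive wells in an image also depends on how many wells are imaged, which this sketch holds equal.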

[000344] The array of capillaries can be arranged to fit within a footprint of a microtiter plate, one standard 
of which is a footprint of 3.3" x 5". Within that footprint, up to 1,000,000 or more capillaries, or wells, can 
be provided in the array. A 1,000,000-well platform for screening gene libraries from mixed populations of 
organisms for novel enzymatic activities provides an ultra high-throughput screening platform in the 3.3" x 
5" footprint of a standard microtiter plate. In this format each well includes a capillary having a diameter of 
200 µm, and which holds 250 nl. The array platform permits rapid screening of genes and gene pathways, 
and increases the productivity of discovery and gene optimization programs for products such as novel 
enzymes, protein therapeutics, compounds and small molecule drugs. 

Any number of novel enzymes of various catalytic classes (e. g., amylases, proteases, secondary amidase) 
can be discovered using the array platform. The same proprietary cost effective process by which the 
100,000-well plates are made can be utilized to make the 1,000,000-well plates for smaller, non-biological
applications. 

[000345] The array screening platform greatly expands the amount of molecular diversity that can be
screened to discover new products. Using 1,000,000-well plates, employing over 12,000 wells per square
centimeter, more than one billion clones per day can be screened using standard liquid phase fluorescent
assays, while at the same time reducing equipment and operator time through massively parallel dispensing
and reading of biological samples. Additionally, the 1,000,000-well plates, with wells each about half the
diameter of a human hair, are reusable and require only minuscule volumes of reagents, making them
highly cost effective and environmentally responsible.

[000346] Increasing the liquid phase screening density from 100,000 to 1,000,000 wells per microtiter
plate footprint represents a 10x increase in density that contributes to accelerated discovery and
development of commercial products, such as antibody and protein therapeutic programs that require rapid
screening of very large numbers of antibody and protein variants created by evolution technologies. This
invention includes the design and fabrication of 1 cm square matrices with 1,000,000 well/plate density
(i.e., 12,000 wells/cm2) using a process that is scalable to full microtiter plate sized arrays.
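
The density figures quoted above can be related to one another with a short calculation. The sketch below uses only the 3.3" x 5" footprint, the 1,000,000-well count and the 12,000 wells/cm2 density given in the text. Reading the 12,000 wells/cm2 figure as the density within an arrayed region smaller than the full footprint is an assumption made here for illustration. The function and variable names are hypothetical.

```python
IN_TO_CM = 2.54  # exact inch-to-centimeter conversion

def footprint_area_cm2(width_in: float, length_in: float) -> float:
    """Area of a rectangular plate footprint in square centimeters."""
    return (width_in * IN_TO_CM) * (length_in * IN_TO_CM)

wells = 1_000_000
area = footprint_area_cm2(3.3, 5.0)   # full microtiter footprint
mean_density = wells / area           # wells/cm2 if spread over the whole footprint
arrayed_area = wells / 12_000         # cm2 needed at the stated 12,000 wells/cm2

print(f"footprint area: {area:.1f} cm2")
print(f"mean density over full footprint: {mean_density:.0f} wells/cm2")
print(f"area occupied at 12,000 wells/cm2: {arrayed_area:.1f} cm2")
```

On these numbers the stated density would place one million wells in somewhat less than the full footprint, which is consistent with leaving a border around the arrayed region.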

[000347] The platform can be utilized to develop a novel liquid phase nitrilase assay in the 1,000,000-well
format, as well as to screen gene libraries from mixed populations of organisms for chiral nitrilases for use
in the manufacture of chemical intermediates for chiral therapeutic compounds.

[000348] In one aspect, the invention uses naked biopanning for direct screening or enrichment for a gene 
or gene cluster from environmental genomic DNA. The enrichment for or isolation of the desired genomic 
DNA is performed prior to any cloning, gene-specific PCR or any other procedure that may introduce 
unwanted bias affecting downstream processing and applications due to toxicity or other issues. Several 
methodologies can be described for this type of sequence-based discovery. These generally include the use
of one or more nucleic acid probes that are partially or completely homologous to the target sequence in
conjunction with the binding of the probe-target complex to a solid phase support. The probe(s) may be
polynucleotide or modified nucleic acid, such as peptide nucleic acid (PNA), and may be used with other
facilitating elements such as proteins or additional nucleic acids in the capture of target DNA. An
amplification step which does not introduce sequence bias may be used to ensure adequate yield for
downstream applications.

[000349] An example of a Naked Biopanning approach can be found in the use of RecA protein and a
complement-stabilized D-loop (csD-loop) structure (Jayasena & Johnston, 1993; Sena and Zarling,
1993) to target genomic DNA of interest. It does not involve complete denaturation of the target DNA and
therefore is of particular interest when one is attempting to capture large genomic fragments. The following
method incorporates the ClonCapture(TM) cDNA selection procedure (CLONTECH Laboratories, Inc.),
with some modification, to take advantage of csD-loop formation, a stable structure which may be used to
capture genomic DNA containing an internal target sequence.

[000350] Environmental genomic DNA is cleaved into fragments (fragment size depends upon type of 
target and desired downstream insert size if making a pre-enriched library) using mechanical shearing or 
restriction digest. Fragments are size selected according to desired length and purified. A biotinylated 
dsDNA probe is produced, based upon existing knowledge of conserved regions within the target, by PCR 
from a positive clone or by synthetic means. The probe can be internally labeled (e.g., by incorporation of
biotin-21-dCTP) or end-labeled with biotin. It must be purified to remove any unincorporated biotin.

The probe is heat denatured (5 min. at 95°C) and placed immediately on ice. The denatured probe is then
reacted with RecA and an ATP mix containing ATP and a nonhydrolyzable analog (15 min. at 37°C). The
target DNA is added and incubated with the RecA/biotinylated probe nucleofilaments to form the csD-loop
structure (20 min. at 37°C). The RecA is then removed by treatment with proteinase K and SDS. After
inactivating the proteinase K with PMSF, washed and blocked (with sonicated salmon sperm DNA)
streptavidin paramagnetic beads are transferred to the reaction and incubated to bind the csD-loop complex
to the support (rotate 30 min. at room temp.). The unbound DNA is removed and may be saved for use as
target for a different probe. The beads are thoroughly washed and the enriched population is eluted using an
alkaline buffer and transferred off. The enriched DNA is then ethanol precipitated and is ready for ligation
and pre-enriched library preparation.

[000351] Other stable complexes may be used instead of the RecA/csD-loop structure for the capture of
genomic DNA. For instance, PNAs may be used, either as "openers" to allow insertion of a probe into
dsDNA (Bukanov et al., 1998), or as tandem probes themselves (Lohse et al., 1999). In the first case,
PNAs bind to two short tracts of homopurines that are in close proximity to each other. They form P-loop
structures, which displace the unbound strand and make it available for binding by a probe, which can then
be used to capture the target using an affinity capture method involving a solid phase. Likewise, PNAs may
be used in a "double-duplex invasion" to form a stable complex and allow target recovery.

[000352] Simpler methods may be used in the retrieval of targets from environmental genomic DNA that 
involve complete denaturation of the DNA fragments. After cutting genomic DNA into fragments of the 
desired length via mechanical shearing or through the use of restriction enzymes, the target DNA may be 
bound to a solid phase using a direct hybridization affinity capture scheme. A nucleic acid probe is 
covalently bound to a solid phase such as a glass slide, paramagnetic bead, or any type of matrix in a
column, and the denatured target DNA is allowed to hybridize to it. The unbound fraction may be collected
and re-hybridized to the same probe to ensure a more complete recovery, or to a host of different probes, as 
a part of a cascade scenario, where a population of environmental genomic DNA is subsequently panned 
for a number of different genes or gene clusters. 

[000353] Linkers containing restriction sites and sites for common primers may be added to the ends of the
genomic fragments using sticky-ended or blunt-ended ligations (depending upon the method used for
cutting the genomic DNA). These enable one to amplify the size-selected inserted fragment population by
PCR without significant sequence bias. Thus, after using any of the above-mentioned techniques for
isolation or enrichment, one may help to ensure adequate recovery for downstream processing.
Furthermore, the recovered population is ready for cutting and ligation into a suitable vector and
contains the priming sites for sequencing at any time.

[000354] A variation of the above scheme involves including a tag from a combinatorial synthesis of
polynucleotide tags (Brenner et al., 1999) within the linker that is attached onto the ends of the genomic
fragments. This allows each fragment within the starting population to have its own unique tag. Therefore,
when amplified with common primers, each of these uniquely tagged fragments gives rise to a multitude of
in vitro clones which are then bound to the paramagnetic bead containing millions of copies of the
complementary, covalently bound anti-tag. A fluorescently labeled, target specific probe may be
subsequently hybridized to the target-containing beads. The beads may be sorted using FACS, where the 
positives may be sequenced directly from the beads and the insert may be cut out and ligated into the 
desired vector for further processing. The negative population may be hybridized with other probes and 
resorted as part of the cascade scenario previously described. 

[000355] Transposon technology may allow the insertion of environmental genomic DNA into a host 
genome through the use of transposomes (Goryshin & Reznikoff, 1998) to avoid bias resulting from 
expression of toxic genes. The host cells are then cultured to provide more copies of target DNA for 
discovery, isolation, and downstream processes. 

CONTAINMENT DEVICES

[000356] In one aspect, the invention comprises methods and systems for
maintaining and identifying cells using containment devices. In one aspect, the methods and systems
comprise identifying and/or isolating a biomolecule or bioactivity of interest. The microenvironments used
in this invention may be enclosed, or contained, in any containment device suitable for holding
microenvironments. The containment device may be a petri or culture dish, a test tube, a porous tubing, a
jar, a bottle, a flask, or a column or other cylindrical device. The containment device can also have a first
port for influx of media and/or nutrients or amendments, and a second port for efflux of media and/or
nutrients and amendments, a collection device for collection of waste products, and at least one filter to
prevent loss of the microenvironments from the containment device. In one embodiment, the influx port is
located at the bottom of the containment device and the efflux port is positioned at the top of the
containment device, such that the media constantly flows through the containment device. In a preferred
embodiment, the system also comprises a device (including any means) for causing media and/or nutrients
and amendments to flow to facilitate movement of media through the microenvironments contained therein,
e.g., the system facilitates movement of nutrients, e.g., aqueous media, gases, etc., e.g., in an upward or
circular direction within the containment device. An exemplary device may be a pump or equivalent.

[000357] The containment device may be comprised of glass, plastic, metal or metal-like substances, or
any non-reactive substance. One skilled in the art would understand that any liquid-tight device is suitable
for practicing the invention. The containment device may be held in a stationary, movable, or other
configuration capable of movement, such as by shaking. While particular embodiments have been provided
by way of example, those skilled in the art would understand that additional embodiments may be used to
practice the invention.

[000358] The containment device can be on a large or small scale, for example, in one aspect the device
can be manually moved in toto to an environmental control chamber, e.g., a cold room, incubator, heat
chamber and the like.

[000359] In one aspect, the containment device is configured to eliminate or minimize "dead spots" or corners;
thus, substantial numbers of the microenvironments in the containment device are in contact with nutrient
flow-through.

LASER MICROSCOPY

[000360] In one aspect, the systems and methods of the present invention use a
laser microscope to visualize the cell(s) contained in a microenvironment. The laser microscope may be
used to select the desired microenvironment from among other microenvironments, e.g., when they are
placed on a substrate, such as a glass slide. Laser microscopes, laser-capture microdissection, and
laser-pressure catapults are known in the art, and the systems and methods of the present invention can
incorporate any known system or method of using them; for example, using a laser positioned on a microscope
for use in microdissection and/or catapulting desired samples directly from a substrate. In one embodiment,
the system is comprised of a PALM(TM) brand laser (P.A.L.M. Mikrolaser Technologie) and/or a Zeiss
Axiovert 200 microscope (Carl Zeiss, Inc.).

[000361] In one aspect, the system and method of the present invention applies microenvironments to
slides for use with the laser microscope. The microenvironments may be fluorescence activated cell sorted
directly onto a slide, or may be applied manually: the microenvironments are diluted in buffer or media,
a pipette tip is submerged into the diluted microcapsule solution, and the solution is spotted onto the slide by
touching the slide briefly with the tip. In a preferred embodiment, a monolayer of microenvironments is
applied to the slide.

[000362] In one aspect, media for use in containment devices (e. g., growth columns) containing 
microenvironments containing cells from a soil sample can be as follows. Even though the open ocean is 
stratified and heterogeneous, soil is even more so, and this is reflected in the diversity of the physiological 
and metabolic capabilities of the soil biota. 

Torsvik and co-workers used reassociation kinetics of single-stranded DNA to show that there is
considerable genetic diversity among the microorganisms that are found in soil (see, for example, Torsvik,
V., Daae, F. L., Sandaa, R. A. & Ovreas, L. Novel techniques for analyzing microbial diversity in natural
and perturbed environments. J. Biotechnol. 64, 53-62 (1998) and Torsvik, V., Salte, K., Sorheim, R. &
Goksoyr, J. Comparison of phenotypic diversity and DNA heterogeneity in a population of soil bacteria.
Appl. Environ. Microbiol. 56, 776-781 (1990)). They estimated that a pasture soil sample contained about
3,500-8,800 genome equivalents. This could result in approximately 10,000 different species of equivalent
abundance (see Torsvik, V., Ovreas, L. & Thingstad, T. G. Prokaryotic diversity: magnitude, dynamics, and
controlling factors. Science 296, 1064-1066 (2002)). This immense diversity presents a challenge for
researchers who are attempting to completely describe microbial communities. Even small changes in the
soil environment seem to have a large influence on the overall species diversity and community structure.
Some of the environmental factors that are known to influence community composition are heavy metals
(see Sandaa, R. et al. Analysis of bacterial communities in heavy-metal-contaminated soils at different
levels of resolution. FEMS Microbiol. Ecol. 30, 237-251 (1999)), organic contaminants (Stephen, J. R. et al.
Microbial characterization of a JP-4 fuel-contaminated site using a combined lipid biomarker/polymerase
chain reaction-denaturing gradient gel electrophoresis (PCR-DGGE)-based approach. Environ. Microbiol.
1, 231-241 (1999)) and pesticides (el Fantroussi, S., Verschuere, L., Verstraete, W. & Top, E. M. Effect of
phenylurea herbicides on soil microbial communities estimated by analysis of 16S rRNA gene fingerprints
and community-level physiological profiles. Appl. Environ. Microbiol. 65, 962-988 (1999)). Physical
parameters of the soil, such as particle size, permeability, porosity, water content, mineral composition and
plant cover are all factors that can influence microbial composition (see Sessitsch, A., Weilharter, A.,
Gerzabek, M. H., Kirchmann, H. & Kandeler, E. Microbial population structures in soil particle size
fractions of a long-term fertilizer field experiment. Appl. Environ. Microbiol. 67, 4215-4224 (2001);
Girvan, M. S., Bullimore, J., Pretty, J. N., Osborn, A. M. & Ball, A. S. Soil type is the primary determinant
of the composition of the total and active bacterial communities in arable soils. Appl. Environ. Microbiol.
69, 1800-1809 (2003); Kowalchuk, G. A., Buma, D. S., de Boer, W., Klinkhamer, P. G. & van Veen, J. A.
Effects of above-ground plant species composition and diversity on the diversity of soil-borne
microorganisms. Antonie van Leeuwenhoek 81, 509-520 (2002)). Assessing the microbial diversity by
itself using a snapshot approach can therefore be misleading.
Only analyses of the microbial diversity within defined spatial and temporal coordinates, in combination
with measurements of geochemical and physical parameters, will allow investigators to link changes in the
community composition with environmental factors. (See Keller, M. and Zengler, K., Tapping into
Microbial Diversity, Nature Reviews, Vol. 2 (February 2004), hereby incorporated by reference in its
entirety.)

[000363] Without further elaboration, it is believed that one skilled in the art can, using the
preceding description, utilize the present invention to its fullest extent. The following examples are to be
considered illustrative and thus are not limiting of the remainder of the disclosure in any way whatsoever.

EXAMPLES

Example 1: DNA Isolation and Library Construction

[000364] The following outlines
exemplary procedures used to generate a gene library from a mixed population of organisms.

[000365] DNA isolation. DNA is isolated using the IsoQuick Procedure as per manufacturer's instructions
(Orca Research Inc., Bothell, WA). DNA can be normalized according to Example 2 below. Upon
isolation the DNA is sheared by pushing and pulling the DNA through a 25G double-hub needle and 1-cc
syringes about 500 times. A small amount is run on a 0.8% agarose gel to make sure the majority of the
DNA is in the desired size range (about 3-6 kb).

[000366] Blunt-ending DNA. The DNA is blunt-ended by mixing 45 ul of 10X Mung Bean Buffer, 2.0 ul
Mung Bean Nuclease (150 u/ul) and water to a final volume of 405 ul. The mixture is incubated at 37°C
for 15 minutes. The mixture is phenol/chloroform extracted followed by
an additional chloroform extraction. One ml of ice cold ethanol is added to the final extract to precipitate
the DNA. The DNA is precipitated for 10 minutes on ice. The DNA is removed by centrifugation in a
microcentrifuge for 30 minutes. The pellet is washed with 1 ml of 70% ethanol and repelleted in the
microcentrifuge. Following centrifugation the DNA is dried and gently resuspended in 26 ul of TE buffer.

[000367] Methylation of DNA. The DNA is methylated by mixing 4 ul of 10X EcoRI Methylase Buffer,
0.5 ul SAM (32 mM), 5.0 ul EcoRI Methylase (40 u/ul) and incubating at 37°C for 1 hour. In order to ensure
blunt ends, add to the methylation reaction: 5.0 ul of 100 mM MgCl2, 8.0 ul of dNTP mix (2.5 mM of each
dGTP, dATP, dTTP, dCTP), 4.0 ul of Klenow (5 u/ul) and incubate at 12°C for 30 minutes.

[000368] After 30 minutes add 450 ul 1X STE. The mixture is phenol/chloroform extracted once followed
by an additional chloroform extraction. One ml of ice cold ethanol is added to the final extract to precipitate
the DNA. The DNA is precipitated for 10 minutes on ice. The DNA is removed by centrifugation in a
microcentrifuge for 30 minutes. The pellet is washed with 1 ml of 70% ethanol, repelleted in the
microcentrifuge and allowed to dry for 10 minutes.

[000369] Ligation. The DNA is ligated by gently resuspending the DNA in 8 ul EcoRI adaptors (from
Stratagene's cDNA Synthesis Kit), 1.0 ul of 10X Ligation Buffer, 1.0 ul of 10 mM rATP, 1.0 ul of T4 DNA
Ligase (4 Wu/ul) and incubating at 4°C for 2 days. The ligation reaction is terminated by heating for 30
minutes at 70°C.

[000370] Phosphorylation of adaptors. The adaptor ends are phosphorylated by mixing the ligation reaction
with 1.0 ul of 10X Ligation Buffer, 2.0 ul of 10 mM rATP, 6.0 ul of H2O, 1.0 ul of polynucleotide kinase
(PNK) and incubating at 37°C for 30 minutes. After 30 minutes 31 ul H2O and 5 ml 10X STE are added to
the reaction and the sample is size fractionated on a Sephacryl S-500 spin column. The pooled fractions
(1-3) are phenol/chloroform extracted once followed by an additional chloroform extraction. The DNA is
precipitated by the addition of ice cold ethanol on ice for 10 minutes. The precipitate is pelleted by
centrifugation in a microfuge at high speed for 30 minutes. The resulting pellet is washed with 1 ml 70%
ethanol, repelleted by centrifugation and allowed to dry for 10 minutes. The sample is resuspended in 10.5
ul TE buffer. Do not plate. Instead, ligate directly to lambda arms as above except use 2.5 ul of DNA and
no water.

[000371] Sucrose Gradient (2.2 ml) Size Fractionation. Stop ligation by heating the sample to 65°C for 10
minutes. Gently load sample on 2.2 ml sucrose gradient and centrifuge in mini-ultracentrifuge at 45K, 20°C
for 4 hours (no brake). Collect fractions by puncturing the bottom of the gradient tube with a 20G needle
and allowing the sucrose to flow through the needle. Collect the first 20 drops in a Falcon 2059 tube then
collect 10 1-drop fractions (labeled 1-10). Each drop is about 60 ul in volume. Run 5 ul of each fraction on
a 0.8% agarose gel to check the size. Pool fractions 1-4 (about 10-1.5 kb) and, in a separate tube, pool
fractions 5-7 (about 5-0.5 kb). Add 1 ml ice cold ethanol to precipitate and place on ice for 10 minutes.
Pellet the precipitate by centrifugation in a microfuge at high speed for 30 minutes. Wash the pellets by
resuspending them in 1 ml 70% ethanol and repelleting them by centrifugation in a microfuge at high speed
for 10 minutes and dry. Resuspend each pellet in 10 ul of TE buffer.

[000372] Test Ligation to Lambda Arms. Plate assay by spotting 0.5 ul of the sample on agarose containing
ethidium bromide along with standards (DNA samples of known concentration) to get an approximate
concentration. View the samples using UV light and estimate concentration compared to the standards.
Fraction 1-4 = >1.0 ug/ul. Fraction 5-7 = 500 ng/ul.

[000373] Prepare the following ligation reactions (5 ul reactions) and incubate at 4°C, overnight:

Sample         H2O      10X Ligase   10 mM    Lambda arms   Insert   T4 DNA Ligase
                        Buffer       rATP     (ZAP)         DNA      (4 Wu/ul)
Fraction 1-4   0.5 ul   0.5 ul       0.5 ul   1.0 ul        2.0 ul   0.5 ul
Fraction 5-7   0.5 ul   0.5 ul       0.5 ul   1.0 ul        2.0 ul   0.5 ul

[000374] Test Package and Plate. Package the ligation reactions following manufacturer's protocol. Stop packaging
reactions with 500 ul SM buffer and pool packaging that came from the same ligation. Titer 1.0 ul of each
pooled reaction on appropriate host (OD600 = 1.0) [XL1-Blue MRF']. Add 200 ul host (in 10 mM MgSO4) to
Falcon 2059 tubes, inoculate with 1 ul packaged phage and incubate at 37°C for 15 minutes. Add about 3
ml 48°C top agar [50 ml stock containing 150 ul IPTG (0.5 M) and 300 ul X-GAL (350 mg/ml)] and plate
on 100 mm plates. Incubate the plates at 37°C, overnight.

[000375] Amplification of Libraries (5.0 x 10^5 recombinants from each library). Add 3.0 ml host cells
(OD600 = 1.0) to two 50 ml conical tubes and inoculate with 2.5 x 10^5 pfu of phage per conical tube.
Incubate at 37°C for 20 minutes. Add top agar to each tube to a final volume of 45 ml. Plate each tube
across five 150 mm plates. Incubate the plates at 37°C for 6-8 hours or until plaques are about pin-head in
size. Overlay the plates with 8-10 ml SM Buffer and place at 4°C overnight (with gentle rocking if
possible).

[000376] Harvest Phage. Recover phage suspension by pouring the SM buffer off each plate into a 50-ml
conical tube. Add 3 ml of chloroform, shake vigorously and incubate at room temperature for 15 minutes.
Centrifuge the tubes at 2K rpm for 10 minutes to remove cell debris. Pour supernatant into a sterile flask,
add 500 ul chloroform and store at 4°C.

[000377] Titer Amplified Library. Make serial dilutions of the harvested phage (for example, 10^-3 = 1 ul
amplified phage in 1 ml SM Buffer; 10^-6 = 1 ul of the 10^-3 dilution in 1 ml SM Buffer). Add 200 ul host
(in 10 mM MgSO4) to two tubes. Inoculate one tube with 10 ul 10^-6 dilution (10^-5). Inoculate the other
tube with 1 ul 10^-6 dilution (10^-6). Incubate at 37°C for 15 minutes. Add about 3 ml 48°C top agar [50 ml
stock containing 150 ul IPTG (0.5 M) and 375 ul X-GAL (350 mg/ml)] to each tube and plate on 100 mm
plates. Incubate the plates at 37°C, overnight. Excise the ZAP II library to create the pBLUESCRIPT
library according to manufacturer's protocols (Stratagene).
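
The titering step above implies a simple back-calculation from plaque counts to the titer of the undiluted stock. The following is a minimal sketch of that standard calculation. The function name and the plaque counts are hypothetical and are not taken from the disclosure; only the 1 ul and 10 ul plating volumes and the 10^-6 dilution follow the procedure described.

```python
def titer_pfu_per_ml(plaques: int, dilution: float, volume_plated_ul: float) -> float:
    """Back-calculate the titer of the undiluted phage stock in pfu/ml.

    dilution is the overall dilution of the tube that was sampled (e.g. 1e-6),
    volume_plated_ul is the volume of that dilution used to inoculate the host.
    """
    volume_plated_ml = volume_plated_ul / 1000.0
    return plaques / (dilution * volume_plated_ml)

# Hypothetical plaque counts, for illustration only.
from_1ul = titer_pfu_per_ml(plaques=50, dilution=1e-6, volume_plated_ul=1.0)
from_10ul = titer_pfu_per_ml(plaques=500, dilution=1e-6, volume_plated_ul=10.0)
print(f"{from_1ul:.2e} pfu/ml, {from_10ul:.2e} pfu/ml")
```

Plating two volumes of the same dilution gives two independent estimates of the same titer, and agreement between them is a quick check on pipetting and counting.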

Example 2: Construction of a Stable, Large Insert Picoplankton Genomic DNA Library

[000378] Cell
collection and preparation of DNA. Agarose plugs containing concentrated picoplankton cells were
prepared from samples collected on an oceanographic cruise from Newport, Oregon to Honolulu, Hawaii.
Seawater (30 liters) was collected in Niskin bottles, screened through 10 um Nitex, and concentrated by
hollow fiber filtration (Amicon DC10) through 30,000 MW cutoff polysulfone filters. The concentrated
bacterioplankton cells were collected on a 0.22 um, 47 mm Durapore filter, and resuspended in 1 ml of 2X
STE buffer (1 M NaCl, 0.1 M EDTA, 10 mM Tris, pH 8.0) to a final density of approximately 1 x 10^10
cells per ml. The cell suspension was mixed with one volume of 1% molten Seaplaque LMP agarose
(FMC) cooled to 40°C, and then immediately drawn into a 1 ml syringe. The syringe was sealed with
parafilm and placed on ice for 10 min. The cell-containing agarose plug was extruded into 10 ml of Lysis
Buffer (10 mM Tris pH 8.0, 50 mM NaCl, 0.1 M EDTA, 1% Sarkosyl, 0.2% sodium deoxycholate, 1
mg/ml lysozyme) and incubated at 37°C for one hour. The agarose plug was then transferred to 40 ml of
ESP Buffer (1% Sarkosyl, 1 mg/ml proteinase K, in 0.5 M EDTA), and incubated at 55°C for 16 hours. The
solution was decanted and replaced with fresh ESP Buffer, and incubated at 55°C for an additional hour.
The agarose plugs were then placed in 50 mM EDTA and stored at 4 degrees Celsius shipboard for the
duration of the oceanographic cruise.

[000379] One slice of an agarose plug (72 ul) prepared from a sample collected off the Oregon coast was
dialyzed overnight at 4 degrees Celsius against 1 ml of buffer A (100 mM NaCl, 10 mM Bis Tris
Propane-HCl, 100 ug/ml acetylated BSA: pH 7.0 at 25 degrees Celsius) in a 2 ml microcentrifuge
tube. The solution was replaced with 250 ul of fresh buffer A containing 10 mM MgCl2 and 1 mM DTT
and incubated on a rocking platform for 1 hr at room temperature. The solution was then changed to 250 ul
of the same buffer containing 4 U of Sau3AI (NEB), equilibrated to 37 degrees Celsius in a water bath, and
then incubated on a rocking platform in a 37 degrees Celsius incubator for 45 minutes. The plug was
transferred to a 1.5 ml microcentrifuge tube and incubated at 68 degrees Celsius for 30 minutes to
inactivate the enzyme and to melt the agarose. The agarose was digested and the DNA dephosphorylated
using Gelase and HK-phosphatase (Epicentre), respectively, according to the manufacturer's
recommendations. Protein was removed by gentle phenol/chloroform extraction and the DNA was ethanol
precipitated, pelleted, and then washed with 70% ethanol. This partially digested DNA was resuspended in
sterile H2O to a concentration of 2.5 ng/ul for ligation to the pFOS1 vector.

[000380] PCR amplification results from several of the agarose plugs (data not shown) indicated the
presence of significant amounts of archaeal DNA. Quantitative hybridization experiments using rRNA
extracted from one sample, collected at 200 m of depth off the Oregon Coast, indicated that planktonic
archaea in this assemblage comprised approximately 4.7% of the total picoplankton biomass. This sample
corresponds to "PAC1"-200 m in Table 1 of DeLong et al. (DeLong, 1994), which is incorporated herein by
reference. Results from archaeal-biased rDNA PCR amplification performed on agarose plug lysates
confirmed the presence of relatively large amounts of archaeal DNA in this sample. Agarose plugs
prepared from this picoplankton sample were chosen for subsequent fosmid library preparation. Each 1 ml
agarose plug from this site contained approximately 7.5 x 10^5 cells, therefore approximately 5.4 x 10^5 cells
were present in the 72 ul slice used in the preparation of the partially digested DNA.

[000381] Vector arms were prepared from pFOS1 as described by Kim et al. (Kim, 1992). Briefly, the
plasmid was completely digested with AstII, dephosphorylated with HK phosphatase, and then digested
with BamHI to generate two arms, each of which contained a cos site in the proper orientation for cloning
and packaging ligated DNA between 35-45 kbp.
The partially digested picoplankton DNA was ligated overnight to the pFOS1 arms in a 15 ul ligation
reaction containing 25 ng each of vector and insert and 1 U of T4 DNA ligase (Boehringer-Mannheim). The
ligated DNA in four microliters of this reaction was in vitro packaged using the Gigapack XL packaging
system (Stratagene), the fosmid particles transfected to E. coli strain DH10B (BRL), and the cells spread
onto LBcm15 plates. The resultant fosmid clones were picked into 96-well microtiter dishes containing
LBcm15 supplemented with 7% glycerol. Recombinant fosmids, each containing ca. 40 kb of picoplankton
DNA insert, yielded a library of 3,552 fosmid clones, containing approximately 1.4 x 10^8 base pairs of
cloned DNA. All of the clones examined contained inserts ranging from 38 to 42 kbp. This library was
stored frozen at -80 degrees Celsius for later analysis.
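
The library size quoted above follows directly from the clone count and the approximate insert length. The sketch below reproduces that arithmetic using only the figures given in the text (3,552 clones, ca. 40 kb inserts, 96-well dishes). The function names are hypothetical.

```python
def cloned_base_pairs(n_clones: int, mean_insert_bp: int) -> int:
    """Total cloned DNA, assuming every clone carries an insert of the mean length."""
    return n_clones * mean_insert_bp

def full_dishes(n_clones: int, wells_per_dish: int = 96) -> tuple[int, int]:
    """Number of completely filled dishes and the number of clones left over."""
    return divmod(n_clones, wells_per_dish)

total_bp = cloned_base_pairs(3552, 40_000)
dishes, leftover = full_dishes(3552)
print(f"{total_bp:.2e} bp of cloned DNA in {dishes} dishes ({leftover} clones left over)")
```

The product is about 1.4 x 10^8 bp, in agreement with the stated value, and the clone count corresponds to a whole number of 96-well dishes.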

[000382] Numerous modifications and variations of the present invention are possible in light of the above
teachings; therefore, within the scope of the claims, the invention may be practiced other than as
particularly described.

Example 3: CsCl-Bisbenzimide Gradients Gradient visualization by UV: Visualize gradient by using the 
UV handlamp in the darkroom and mark bandings of the standard which will show the upper and lower 
limit of GC-contents. 

Harvesting of the gradients : 1. Connect Pharmacia-pump LKB PI with fraction collector (BIO-RAD 
model 2128). 

2. Set program: rack 3,5 drops (about 100 ul), all samples. 

3. Use 3 microtiter-dishes (Costar, 96 well cell culture cluster). 

4. Push yellow needle into bottom of the centrifuge tube. 

5. Start program and collect gradient. Don't collect first and last 1-2 ml depending on where your markers 
are. 

Dialysis L Follow microdialyzer instruction manual and use Spectra/Por CE Membrane MWCO 25,000 
(wash membrane with ddH20 before usage). 

2. Transfer samples from the microtiter dish into microdialyzer (Spectra/Por, 3. MicroDialyzer) with 
multipipette. (Fill dialyzer completely with TE, get rid of any air bubble, transfer samples very fast to avoid 
new air-bubbles). 

4. Dialyze against TE for 1 hr on a plate stirrer. 

DNA estifnation with PICOGREENTM 1. Transfer samples (volume after dialysis should be increased 1.5- 
2 times) with multipipette back into microtiter dish. 

2. Transfer 100 ul of the sample into Polytektronix plates. 

3. Add 100 ul Picogreen-solution (5 ul Picogreen-stock-solution + 995 ul TE buffer) to each sample. 

4. Use WPR-plate-reader. 

5. Estimate DNA concentration.

Example 4: Bis-Benzimide Separation of Genomic DNA

[000383] A sample composed of genomic DNA
from Clostridium perfringens (27% G+C), Escherichia coli (49% G+C) and Micrococcus lysodeikticus
(72% G+C) was purified on a cesium-chloride gradient. The cesium chloride (Rf = 1.3980) solution was
filtered through a 0.2 um filter and 15 ml were loaded into a 35 ml OptiSeal tube (Beckman). The DNA
was added and thoroughly mixed. Ten micrograms of bis-benzimide (Sigma; Hoechst 33258) were added
and mixed thoroughly. The tube was then filled with the filtered cesium chloride solution and spun in a
VTi50 rotor in a Beckman L8-70 Ultracentrifuge at 33,000 rpm for 72 hours. Following centrifugation, a
syringe pump and fractionator (Brandel Model 186) were used to drive the gradient through an ISCO UA-5
UV absorbance detector set to 280 nm.

Three peaks representing the DNA from the three organisms were obtained. PCR amplification of DNA
encoding rRNA from a 10-fold dilution of the E. coli peak was performed with the following primers to
amplify eubacterial sequences:

Forward primer: (27F) 5'-AGAGTTTGATCCTGGCTCAG-3' (SEQ ID NO: 1)
Reverse primer: (1492R) 5'-GGTTACGTTGTTACGACTT-3' (SEQ ID NO: 2)

Example 5: FACS/Biopanning

[000384] Infection of library lysates into Exp503 E. coli strain. 25 ml LB + Tet culture
of Exp503 were cultured overnight at 37°C. The next day the culture was centrifuged at 4000 rpm for 10
minutes and the supernatant decanted. 20 ml 10 mM MgSO4 was added and the OD600 checked. Dilute to
OD 1.0.

[000385] In order to obtain a good representation of the library, at least 2-fold (and preferably 5-fold) of the
library lysate titer was used. For example: Titer of library lysate is 2 x 10^6 cfu/ml. Need to plate at least
4 x 10^6 cfu. Can plate approx. 500,000 microcolonies per 150 mm LB-Kan plate. Need 8 plates. Can plate 1 ml
of reaction per plate; need 8 ml of cells + lysate.
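
The worked example above can be written as a small calculation. This is only a sketch of the arithmetic as stated in the text (fold coverage times titer, divided by colonies per plate); the function and parameter names are illustrative assumptions.

```python
import math


def plating_plan(titer_cfu_per_ml, fold_coverage, colonies_per_plate, ml_per_plate=1.0):
    """Number of plates and reaction volume needed for a given library coverage."""
    cfu_needed = titer_cfu_per_ml * fold_coverage
    plates = math.ceil(cfu_needed / colonies_per_plate)
    return {
        "cfu_needed": cfu_needed,
        "plates": plates,
        "reaction_ml": plates * ml_per_plate,
    }


# Values from the text: titer 2 x 10^6 cfu/ml, 2-fold coverage,
# about 500,000 microcolonies per 150 mm plate, 1 ml of reaction per plate.
print(plating_plan(2_000_000, 2, 500_000))
```

For the preferred 5-fold coverage the same call with `fold_coverage=5` gives 20 plates.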

[000386] 2-fold (e.g., 2 ml) of library lysate was mixed with an appropriate amount (e.g., 6 ml) of OD 1.0
Exp503. The sample was incubated at 37°C for at least 1 hour. Plated 1 ml reaction on 150 mm LB-Kan
plate x 8 plates and incubated overnight at 30°C.

Harvesting, induction, and fixing of library in Exp503
cells. Scrape all cells from plates into 20 ml LB using a rubber policeman. Dilute cells approx. 1:100 (200
ul cells/20 ml LB) and incubate at 37°C until culture is OD 0.3. Add 1:50 dilution of 20% sterile glucose
and incubate at 37°C until culture is OD 1.0. Add 1:100 dilution of 1 M MgSO4. Transfer 5 ml of culture
to a fresh tube; the remaining culture can be used as an uninduced control if desired, or discarded.

Add MOI 5 of CE6 bacteriophage to the remaining 5 ml of culture. (CE6 codes for T7 RNA Polymerase)
(e.g., OD 1 = 8 x 10^8 cells/ml x 5 ml = 4 x 10^9 cells x MOI 5 = 2 x 10^10 bacteriophage needed). Incubate
culture + CE6 for 2 hr at 37°C. Cool on ice and centrifuge cells at 4000 rpm for 10 min. Wash with 10 ml
PBS. Fix cells in 600 ul PBS + 1.8 ml fresh, filtered 4% paraformaldehyde. Incubate on ice for 2 hrs. (4%
Paraformaldehyde: Heat 8.25 ml PBS in flask at 65°C. Add 100 ul 1 M NaOH and 0.5 g paraformaldehyde
(stored at 4°C). Mix until dissolved. Add 4.15 ml PBS. Cool to 0°C. Adjust pH to 7.2 with 0.5 M
NaH2PO4. Cool to 0°C. Syringe filter. Use within 24 hrs.) After fixing, centrifuge at 4000 rpm for 10 min.
Resuspend in 1.8 ml PBS and 200 ul 0.1% NP40. Store at 4°C overnight.
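
Two pieces of arithmetic in the paragraph above lend themselves to a quick check: the number of CE6 phage required for a given multiplicity of infection, and the weight/volume percentage of the paraformaldehyde stock. The sketch below assumes the conversion stated in the text (OD 1 = 8 x 10^8 cells/ml) and ignores the small volume added during pH adjustment; all names are illustrative.

```python
import math


def phage_needed(od600, culture_ml, moi, cells_per_ml_at_od1=8e8):
    """Phage particles required to infect a culture at the given MOI."""
    cells = od600 * cells_per_ml_at_od1 * culture_ml
    return cells * moi


def percent_w_v(grams_solute, total_ml):
    """Weight/volume percentage (g per 100 ml)."""
    return 100.0 * grams_solute / total_ml


# 5 ml of culture at OD 1.0, MOI 5 (values from the text).
print(f"{phage_needed(1.0, 5, 5):.1e} phage")

# 0.5 g paraformaldehyde in 8.25 ml PBS + 0.1 ml NaOH + 4.15 ml PBS.
pfa_volume_ml = 8.25 + 0.1 + 4.15
print(f"{percent_w_v(0.5, pfa_volume_ml):.1f}% paraformaldehyde")
```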

[000387] Hybridization of fixed cells. Centrifuge fixed cells at 4000 rpm for 10 min.
Resuspend in 1 ml 40 mM Tris pH 7.6/0.2% NP40. Transfer 100 ul fixed cells to an Eppendorf tube.
Centrifuge for 1 min and remove supernatant. Resuspend each reaction in 50 ul Hybridization buffer (0.9
M NaCl; 20 mM Tris pH 7.4; 0.01% SDS; 25% formamide; can be made in advance and stored at -20°C).
Add 0.5 nmol fluorescein-labeled primer to the appropriate reactions. Incubate with rocking at 46°C for 2
hr. (Hybridization temperature may depend on sequence of primer and template.) Add 1 ml wash buffer to
each reaction, rinse briefly and centrifuge for 1 min. Discard supernatant. (Wash buffer: 0.9 M NaCl; 20
mM Tris pH 7.4; 0.01% SDS.) Add another 1 ml of wash buffer to each reaction, and incubate at 48°C
with rocking for 30 min. Centrifuge and remove supernatant. Visualize cells under microscope using WIB
filter.

[000388] FACS sorting. Dilute cells in 1 ml PBS. If cells are clumping, sonicate for 20 seconds at 1.5
power. FACS sort the most highly fluorescent single cells and collect in 0.5 ml PCR strip tubes
(approximately one 96-well plate per library). PCR single cells with vector specific primers to amplify the
insert in each cell. Electrophorese all samples on an agarose gel and select samples with single inserts.
These can be re-amplified with biotin-labeled primers, hybridized to insert-specific primers, and examined
in an ELISA assay. Positive clones can then be sequenced. Alternatively, the selected samples can be
re-amplified with various combinations of insert-specific primers, or sequenced directly.

Example 6: Large Insert FACS Biopanning Protocol

In one aspect, the methods of the invention use Large
Insert FACS Biopanning (LIFB), FISH or fluorescence detection, to identify microcolonies having cells
which contain or express a specific nucleic acid. LIFB is an ultra-high throughput screening technology
that is based on MicroCapsule (MiC) in situ hybridization, Fluorescence Activated Cell Sorting (FACS)
and Rolling Circle Amplification (RCA). An exemplary protocol is described below.

1. Encapsulate 1 vial of 3% home-made SeaPlaque gel. Each vial of gel can make 10^6 GMDs. Take 100 ul
of thawed frozen fosmid PMF21/DH10B library, OD600 = 0.4, to encapsulate; centrifuge down to 10 ul. Melt
agarose gel, add 100 ul FBS (fetal bovine serum) and vortex.
Place in 50°C water in a beaker. Add 10 ul culture, vortex and add to 17 ml mineral oil. Shake about 30
times, place on the One Cell machine. Blend at 2600 rpm for 1 min at room temperature and at 2600 rpm for
9 minutes on ice. Wash with PBS twice. Resuspend in 10 ml LB + Apr50, shake at 37°C for 4 hours at 230 rpm.
Check microscopically to see the growth and size of microcolonies.

2. Centrifuge at 1500 rpm for 6 min. GMDs are resuspended in 5 ml of 2x SSC and can be saved at 4°C for
several days. Take 200 ul GMD in 2x SSC for each reaction.

3. Resuspend in 10 ml 2x SSC/5% SDS. Incubate 10 min at RT shaking or rotating.
Centrifuge.

4. Resuspend in 5 ml lysis solution containing proteinase K. Incubate 30 min at 37°C shaking or rotating.
Centrifuge.

Lysis Solution:
50 mM Tris pH 8: 0.75 ml 1 M Tris
50 mM EDTA: 1.5 ml 0.5 M EDTA
100 mM NaCl: 300 ul 5 M NaCl
1% Sarkosyl: 0.75 ml 20% Sarkosyl
250 ug/ml Proteinase K: 375 ul proteinase K stock (10 mg/ml)
11.325 ml dH2O

5. Resuspend in 5 ml denaturing solution. Incubate 30 min at RT shaking or rotating.
Centrifuge at 1500 rpm for 5 min.

Denaturing Solution: 0.5 M NaOH/1.5 M NaCl

6. Resuspend in 5 ml neutralizing solution. Incubate 30 min at RT shaking or rotating.
Centrifuge.

Neutralizing Solution: 0.5 M Tris pH 8/1.5 M NaCl

7. Wash in 2X SSC briefly.

8. Aliquot 200 ul/RXN into microcentrifuge tubes, microcentrifuge and take out the 2X SSC. Add 130
ul "DIG EASY HYB" to prehyb for 45 minutes at 37°C. Do prehyb and hyb in Personal Hyb Oven.

9. Aliquot oligo probe and denature at 85°C for 5 minutes, place on ice immediately.
Add appropriate amount of probe (0.5-1 nmol/RXN) and return to rotating hyb oven for O/N.

10. Prepare a 1% (10 mg/ml) solution of Blocking Reagent in PBS. Store at 4°C for use that day.

11. Wash GMDs with 0.8 ml of 2X SSC/0.1% SDS at RT for 15 min, rotating. In the meantime, prewarm the next
wash solution.

12. Wash GMDs with 0.8 ml of 0.5X SSC/0.1% SDS 2 x 15 min at appropriate temp, rotating. If more
stringency is required, the 2nd wash can be done in 0.1X SSC/0.1% SDS.

13. Wash with 0.8 ml/RXN 2X SSC briefly.

14. Block the reaction with 130 ul 1% Blocking Reagent in PBS at RT for 30 minutes.

15. Add 1.4 ul anti-DIG-POD (so 1:100) and incubate at RT for 3 hours.

16. Wash GMDs with 0.8 ml PBS/RXN 3 x 7 minutes at 37°C.

17. Prepare a tyramide working solution by diluting the tyramide stock solution 1:85 in Amplification
buffer/0.0015% H2O2. Apply 130 ul tyramide working solution and incubate in the dark at RT for 30
minutes.

18. Wash 3X for 7 min in 0.8 ml PBS buffer at 37°C.

19. Visualize by microscope and FACS sort. 
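
The lysis solution recipe in step 4 can be checked with the usual dilution relation (final concentration = stock concentration x stock volume / total volume). The sketch below assumes the stock concentrations and volumes read from the recipe (1 M Tris, 0.5 M EDTA, 5 M NaCl, 20% Sarkosyl, 10 mg/ml proteinase K, made up with 11.325 ml water); the variable and function names are illustrative.

```python
# (component, stock volume in ml, stock concentration, unit of the final value)
LYSIS_RECIPE = [
    ("Tris pH 8", 0.75, 1000.0, "mM"),
    ("EDTA", 1.5, 500.0, "mM"),
    ("NaCl", 0.3, 5000.0, "mM"),
    ("Sarkosyl", 0.75, 20.0, "%"),
    ("Proteinase K", 0.375, 10000.0, "ug/ml"),
]
WATER_ML = 11.325


def final_concentrations(recipe, water_ml):
    """Return the total volume and the final concentration of each component."""
    total_ml = water_ml + sum(volume for _, volume, _, _ in recipe)
    finals = {name: stock * volume / total_ml for name, volume, stock, _ in recipe}
    return total_ml, finals


total_ml, finals = final_concentrations(LYSIS_RECIPE, WATER_ML)
print(f"total volume: {total_ml:.3f} ml")
for name, _, _, unit in LYSIS_RECIPE:
    print(f"{name}: {finals[name]:.1f} {unit}")
```

The listed volumes add up to 15 ml, enough for three 5 ml reactions, and give the nominal concentrations stated in the recipe.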

Example 7: Biopanning Protocol

Preparing Insert DNA from the Lambda DNA

PCR amplify inserts using vector specific primers CA98 and CA103.

CA98: ACTTCCGGCTCGTATATTGTGTGG
CA103: ACGACTCACTATAGGGCGAATTGGG

These primers match perfectly to lambda ZAP Express clones (pBKC91).

Reagents:
Lambda DNA prepared from the libraries to be panned (Librarians)
Roche Expand Long Template PCR System #1-759-060
Pharmacia dNTP mix #27-2094-01 or Roche PCR Nucleotide Mix (10 mM) #1-581-295 or Roche dNTPs-PCR grade
#1-969-064

1. Make the insert amplification mix:
X ul dH2O (final 50 ul)
5 ul 10x Expand Buffer #2 (22.5 mM MgCl2)
0.5 or 0.625 ul dNTP mix (20 mM each dNTP)
10 ng (approx) lambda DNA per library (usually 1 ul or 1 ul of a 1:10 dilution)
1-2 ul CA98 (100 ng/ul or 15 uM)
1-2 ul CA103 (100 ng/ul or 15 uM)
0.5 ul Expand Long polymerase mix

2. PCR amplify (Robocycler):
95°C 3 minutes x 1 cycle
95°C 1 minute, 65°C 45 seconds, 68°C 8 minutes x 30 cycles
68°C 8 minutes x 1 cycle
6°C hold

3. Analyze 5 ul of reaction product on a gel.

Note: The reaction product should be a strong smear of products usually ranging from 0.5-5 kb in size and
centered around 1.5-2 kb.

Prepare Biotinylated Hook

Reagents:
PCR reagents
Biotin-14-dCTP (BRL #19518-018)
Individual dNTP stock solutions (Roche dNTPs #1-969-064)
Gene specific template and primers
PCR purification kit (Roche #1732668 or Qiagen Qiaquick #28106)

1. Make 10x biotin dNTP mix:
150 ul biotin-14-dCTP
3 ul 100 mM dATP
3 ul 100 mM dGTP
3 ul 100 mM dTTP
1.5 ul 100 mM dCTP

2. Make PCR mix:
74 ul water
10 ul 10x Expand Buffer #1
10 ul 10x biotin dNTP mix (step #1)
2 ul Primer #1 (100 ng/ul)
2 ul Primer #2 (100 ng/ul)
1 ul template (gene specific) (100 ng/ul)
1 ul Expand Long polymerase mix

3. PCR amplify (Robocycler):
95°C 3 minutes x 1 cycle
95°C 45 seconds, *°C 45 seconds, 68°C ** minutes x 30 cycles
68°C 8 minutes x 1 cycle
6°C hold

* Use an annealing temperature appropriate for your primers.

** Allow 1 minute/kb of target length.

4. Clean up the reaction product using a PCR purification kit. Elute in 50 ul 5T.1E or Qiagen's EB buffer
(10 mM Tris pH 8.5).

5. Check 5 ul on an agarose gel.

Note: The product may be slightly larger than expected due to the incorporation of biotin.

Biopanning

Reagents:
Streptavidin-conjugated paramagnetic beads (CPG MPG-Streptavidin 10 mg/ml #MSTR0502) (Dynal Dynabeads
M-280 Streptavidin)
Sonicated, denatured salmon sperm DNA (heated to 95°C, 5 min) (Stratagene #201190)
PCR reagents
dNTP mix
Magnetic particle separator
Topo-TA cloning kit with Top10F' comp cells (Invitrogen #K4550-40)
High Salt Buffer: 5 M NaCl, 10 mM EDTA, 10 mM Tris pH 7.3

1. Make the following reaction mix for each library/hook combination:
5 ug insert DNA (PCR amplified lambda DNA)
100 ng biotinylated hook (100 ng total if using more than one hook)
4.5 ul 20x SSC for a 3x final concentration (or High Salt buffer)
X ul dH2O for a final volume of 30 ul

2. Denature by heating to 95°C for 10 min. (Robocycler works well for this step.)

3. Hybridize at 70°C for 90 min. (Robocycler)

4. Prepare 100 ul of MPG beads for each sample:
Wash 100 ul beads two times with 1 ml 3x SSC.
Resuspend in: 50 ul 3x SSC (or High Salt buffer) and 10 ul sonicated, denatured salmon sperm DNA
(10 mg/ml) to block (or 100 ng total). (Do not ice.)

5. Add the hybridized DNA to the washed and blocked beads.

6. Incubate at room temp for 30 min, agitating gently in the hybridization oven.

7. Wash twice at room temp with 1 ml 0.1x SSC/0.1% SDS (or high salt buffer) using the magnetic particle
separator.

8. Wash twice at 42°C with 1 ml 0.1x SSC/0.1% SDS (or high salt buffer) for 10 min each. (magnet)

9. Wash once at room temp with 1 ml 3x SSC. (magnet)

10. Elute DNA by resuspending the beads in 50 ul dH2O and heating the beads to 70°C for 30 min or 85°C for
10 min in the hyb oven (or thermomixer at 500 rpm). Separate using magnet, and discard the beads.

11. PCR amplify 1-5 ul of the panned DNA using the same protocol as Preparing Insert DNA from the
Lambda DNA above.

12. Check 5 ul on agarose gel. Note: The reaction product should be a strong smear of products usually
ranging from 0.5-5 kb in size and centered around 1.5-2 kb.

13. Clone 1-4 ul into pCR2.1-TopoTA cloning vector.

14. Transform 2 x 3 ul into Top10F' chemically competent cells. Plate each transformation on 2 x 150 mm
LB-kan plates. Incubate at 30°C overnight.
(Ideal density is approximately 3000 colonies per plate.)

Repeat transformation if necessary to get a representative number of colonies per library. 
Archive the Biopanned DNA. 

15. Transfer plates to Hybridization group, along with appropriate templates and a single primer for run off 
PCR 32P-labeling reactions. 
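
The reaction mix in step 1 of the biopanning procedure leaves the water volume as "X". The sketch below shows one way to fill it in. It assumes a 20x SSC stock (the value consistent with 4.5 ul giving 3x in 30 ul), and the DNA concentrations passed in the usage line are hypothetical placeholders, not values from the protocol.

```python
def hybridization_mix(insert_ng_per_ul, hook_ng_per_ul,
                      insert_ng=5000.0, hook_ng=100.0,
                      ssc_stock_x=20.0, ssc_final_x=3.0, total_ul=30.0):
    """Volumes (ul) of each component for one library/hook hybridization."""
    insert_ul = insert_ng / insert_ng_per_ul
    hook_ul = hook_ng / hook_ng_per_ul
    ssc_ul = ssc_final_x * total_ul / ssc_stock_x
    water_ul = total_ul - insert_ul - hook_ul - ssc_ul
    if water_ul < 0:
        raise ValueError("DNA stocks are too dilute to fit in the final volume")
    return {"insert_ul": insert_ul, "hook_ul": hook_ul,
            "ssc_ul": ssc_ul, "water_ul": water_ul}


# Hypothetical stock concentrations: insert DNA at 500 ng/ul, hook at 100 ng/ul.
print(hybridization_mix(insert_ng_per_ul=500.0, hook_ng_per_ul=100.0))
```

If the insert DNA is too dilute for 5 ug to fit in 30 ul, the function raises an error rather than returning a negative water volume.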

Analysis of Results 1. Filter lifts from plates will be performed, and hybridized to the appropriate probe. 
Resultant films will be given to the Biopanned. 

2. Align films to original colony plates. Colonies corresponding to positive "dots-on-film" should be
toothpicked, patched onto an LB-Kan plate, and inoculated in 4 ml TB-Kan. For automation, inoculate 1 ml
TB-kan in a 96-well plate and incubate 18 hrs at 37°C.

3. Overnight cultures are mini-prepped (Biomek if possible). Digest with EcoRI to determine insert size.

2 ul DNA
0.5 ul EcoRI
1 ul 10x EcoRI buffer
6.5 ul dH2O

Incubate at 37°C for 1 hr. Check insert size on
agarose gel. Large insert clones (>500 bp) are then PCR confirmed if possible with gene specific
primers.

4. Putative positive clones are then sequenced. 

5. Glycerol stocks should be made of all interesting clones (>500bp). 

Example 8: HIGH THROUGHPUT CULTIVATION OF MARINE MICROBES FROM SEA SAMPLE

The following example describes an exemplary method for culturing microenvironments in a containment
device.

Preparation of cell suspension

[000389] Cells were obtained after filtering 110 L of surface water through a
0.22 um membrane. The cell pellet was then resuspended with seawater and a volume of 100 ul was used
for cell encapsulation. This provided cell numbers of approximately 10^7 cells per mL.

Cell encapsulation into GMDs

[000390] The following reagents were used: CelMix Emulsion Matrix and
Cell Encapsulation Matrix (One Cell Systems, Inc., Cambridge, MA), Pluronic F-68 solution and
Dulbecco's Phosphate Buffered Saline (PBS, without Ca2+ and Mg2+). Scintillation vials each containing
15 ml of CelMix(TM) emulsion matrix were placed in a 40°C water bath and were equilibrated to 40°C for a
minimum of 30 minutes. 30 ul of Pluronic F-68 solution (10%) was added to each of 6 vials of melted
CelGel agarose. The agarose mixture was incubated at 40°C for a minimum of 3 minutes. 100 ul of cells
(resuspended in PBS) were added per 6 vials of the CelGel bottles and the resulting mixture was incubated at
40°C for 3 minutes. Using a 1 ml pipette and avoiding air bubbles, the CelGel(TM)-cell mixture was added
dropwise to the warmed CelMix in the scintillation vial. This mixture was then emulsified using the
CellSys 100(TM) MicroDrop maker as follows: 2200 rpm for 1 minute at room temperature (RT), then 2200
rpm for 1 minute on ice, then 1100 rpm for 6 minutes on ice, resulting in an encapsulation mixture
comprised of microdrops that were approximately 10-20 microns in diameter. The encapsulation mixture
was then divided into two 15 ml conical tubes and in each tube, the emulsion was overlayed with 5 ml of
PBS. The tubes were then centrifuged at 1800 rpm in a bench top centrifuge for 10 minutes at RT,
resulting in a visible Gel MicroDrop (GMD) pellet. The oil phase was then removed with a pipette and
disposed of in an oil waste container. The remaining aqueous supernatant was aspirated and each pellet was
resuspended in 2 ml of PBS. Each resuspended pellet was then overlayed with 10 ml of PBS. The GMD
suspension was then centrifuged at 1500 rpm for 5 minutes at RT. The overlaying process was repeated and the
GMD suspension was centrifuged again to remove all free-living bacteria. The supernatant was then removed
and the pellet was resuspended in 1 ml of seawater. 10 ul of the GMD suspension was then examined under
the microscope in order to check for uniform GMD size and containment of the encapsulated organism
in the GMD. This protocol resulted in 1 to 4 cells encapsulated in each GMD.

Sorting of GMDs containing single cells for identification by 16S rRNA gene sequence

[000391] On the
first day of cultivation we sorted occupied GMDs that contained one to 4 cells, although most had only
single cells. The sorting was done in a Mo-Flo instrument (Cytomation) by staining the cells inside the
GMDs with Syto9 and then selecting green fluorescence (from the stain) and side-scatter as parameters for
sorting gates. The staining was necessary since the cells are much smaller than E. coli and therefore show
very low light-scatter signals. The target GMDs were sorted into a 96-well plate containing a PCR mixture
and were ready to be amplified immediately after sorting. We used a Hotstart enzyme (Qiagen) such that no
reaction would occur before boiling for 15 min, which allows work at room temperature before
amplification. Before starting the PCR it was necessary to irradiate the PCR mixture with a Stratalinker
(Stratagene) at full power for 14 min to cross-link any potential genomic DNA present in the mixture
before sorting. The primers used include the pairs 27F and 1392R, and 27F and 1522R, named according to the
positions in the E. coli gene sequence. The primers were obtained from IDT-DNA Technologies and were
purified by HPLC. The primer concentration used in the reactions was 0.2 uM. We used a "touchdown"
program consisting of 3 stages: a) boiling 15 min, b) 15 cycles decreasing the annealing temperature from
62 to 55°C by 0.5 degrees per cycle, c) a series of cycles (20-40) increasing the annealing time 1 sec per
cycle starting with 30 sec but keeping the temperature constant at 55°C. All the other stages of the PCR
were as recommended by the manufacturer. This protocol allowed the amplification of the 16S rRNA gene
from individual encapsulated cells or small consortia of cells. The PCR products were then cloned into
TOPO-TA (Invitrogen) cloning vectors and sequenced by dye-termination cycle sequencing (Perkin-Elmer
ABI).
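
The annealing steps of the touchdown program described above can be laid out cycle by cycle. The following is a sketch only: the annealing time during the touchdown stage is not given in the text and is assumed here to be 30 sec, and the function name and defaults are illustrative.

```python
def touchdown_annealing(start_c=62.0, end_c=55.0, step_c=0.5,
                        extension_cycles=20, start_anneal_s=30, increment_s=1):
    """Return a list of (annealing temperature in C, annealing time in s) per cycle."""
    schedule = []
    # Stage b: lower the annealing temperature by step_c each cycle, start to end inclusive.
    n_touchdown = int(round((start_c - end_c) / step_c)) + 1
    for i in range(n_touchdown):
        # Annealing time in this stage is an assumption (not stated in the text).
        schedule.append((start_c - i * step_c, start_anneal_s))
    # Stage c: constant temperature, annealing time grows by increment_s per cycle.
    for j in range(extension_cycles):
        schedule.append((end_c, start_anneal_s + j * increment_s))
    return schedule


for cycle, (temp_c, seconds) in enumerate(touchdown_annealing(), start=1):
    print(f"cycle {cycle:2d}: anneal {temp_c:.1f} C for {seconds} s")
```

Stepping from 62 to 55°C in 0.5 degree decrements gives the 15 touchdown cycles stated in the text; `extension_cycles` can be set anywhere in the 20 to 40 range described.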

Cell growth of encapsulated cells inside GMDs [000392] The encapsulated GMDs were placed into 
chromatography columns that allowed the flow of culture media providing nutrients for growth and also 
washed out waste products from cells. The experiment consisted of 4 treatments including the use of 
seawater, and amendments (inorganic nutrients including trace metals and vitamins, amino acids including 
trace metals and vitamins, and diluted rich organic marine media). This different set of nutrients provided a 
gradient to bias different microbial populations. The seawater used as base for the media was filter
sterilized through 1000 kDa and 0.22 um filter membranes prior to amendment and introduction to the
columns. The cells were then incubated for a period of 17 weeks and cell growth was monitored by phase
contrast microscopy. Cell identification was done by 16S rRNA gene sequence of grown colonies. 

Sorting of GMDs containing colonies consisting of one or more cell types

[000393] To identify the
diversity and the community composition of the different treatments we performed a "bulk sorting" of the
GMDs. This was done by taking a subsample of the GMDs from each column and running them through the flow
cytometer. We selected forward- and side-scatter as gating criteria, since occupied GMDs with a colony of 10
or more cells of individual cell sizes ranging from 0.5 to 5 um were easy to discriminate from empty
GMDs.

We verified each time by phase contrast microscopy that we selected the correct gate for sorting. We then
sorted a total of 300 GMDs per individual PCR reaction (prepared as above) and ran the reaction in a
thermocycler for a total of 50 to 60 cycles to have enough PCR product to be visualized by gel
electrophoresis. The resulting PCR reactions from the same column were combined (2 to 4 replicates),
cloned and sequenced as above to assess the phylogenetic diversity from each column and observe the bias
effect resulting from the use of different nutrient regimes.

Gene sequencing and phylogenetic analyses [000394] The gene sequences were aligned and compared to 
our 16S rRNA database with the ARB phylogenetic program. Maximum Parsimony and neighbor joining 
trees were constructed using the amplified gene sequences (approximately 1400 bp). 

Example 9: Microextraction Procedure

[000395] Single copies of Streptomyces-containing clones from a
mixed population are FACS-sorted onto agar, allowed to develop into individual colonies, and bioassayed
as individual clones.

Construction of a clone expressing a bioactive metabolite

[000396] A genomic library of Streptomyces
murayamaensis is constructed in pJ0436 (Bierman et al., Gene 1991 116: 43-49) vector and hybridized
with probes for polyketide synthase. A clone (IB) which hybridized was chosen and shuttled into
Streptomyces venezuelae ATCC 10712 strain. The vector pMF17 was also introduced into S. diversa as a
negative control. When bioassayed on solid media, clone IB expressed strong bioactivity towards
Micrococcus luteus, demonstrating that the insert present in clone IB encoded a bioactive polyketide
molecule.

FACS-sorting of S. venezuelae clones

[000397] The S. venezuelae exconjugant spores containing clone
IB, as well as pJ0436 vector, are FACS-sorted in 48-well, 96-well, and 384-well format into corresponding
plates containing MYM agar + Apramycin 50 ug/ml. The single spore clones were allowed to germinate,
grow and sporulate for 4-5 days.

[000398] Natural product extraction procedure: After the clones were fully grown and sporulated for 4-5
days, the following volumes of methanol were added to each well containing the clones.

48 well format: 0.8 ml
96 well format: 0.100 ml
384 well format: 0.06 ml

The plates were incubated at room temperature overnight.

The next day, the following volumes were recovered from the wells containing the clones.

48 well format: 0.3 ml
96 well format: 0.060 ml
384 well format: 0.030 ml

The extracts were assayed from
a single well, and after combining extracts from 2, 4 and 10 wells. The methanol extract was dried and
resuspended in 40 ul of methanol:water, 20 ul of which was assayed against M. luteus as the
indicator strain.
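
The added and recovered methanol volumes listed above imply a different recovery fraction for each plate format. The sketch below simply tabulates that ratio from the values in the text; the dictionary layout and function name are illustrative.

```python
# (methanol added in ml, extract recovered in ml) per plate format, from the text
EXTRACTION_VOLUMES_ML = {
    "48-well": (0.8, 0.3),
    "96-well": (0.100, 0.060),
    "384-well": (0.06, 0.030),
}


def recovery_fraction(added_ml, recovered_ml):
    """Fraction of the added solvent that was recovered from the well."""
    return recovered_ml / added_ml


for plate_format, (added, recovered) in EXTRACTION_VOLUMES_ML.items():
    print(f"{plate_format}: {100 * recovery_fraction(added, recovered):.1f}% recovered")
```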

A single colony of S. venezuelae containing clone IB produced enough bioactive molecule, in 48-well, 96- 
well as well as 384-well format, to be extracted by the microextraction procedure and to be detected by 
bioassay. 

Example 11: Expression of actinorhodin pathway in S. venezuelae 10712

[000399] When a Sau3A pIJ2303
library constructed in pJ0436 was introduced into S. venezuelae, one exconjugant which appeared
blue-grey in color was spotted. This exconjugant showed blue pigment on R2-S agar, demonstrating the
successful expression of a heterologous (actinorhodin) pathway in S. venezuelae.

Segregational stability of S. venezuelae 10712 (pJ0436 :: actinorhodin)

[000400] Since Streptomyces
clones for small molecule production are grown in the absence of antibiotic selection, it was important to
determine how stable the S. venezuelae pJ0436 recombinant clones are. The S. venezuelae 10712
(pJ0436 :: actinorhodin) clone was used as an example.

[000401] The act clone was grown in R2-S liquid cultures with and without apramycin and total cell count 
was done by plating on R2-S agar with and without apramycin. The act clone gave 100% and 96% 
apramycin resistant colonies when grown with and without apramycin, respectively. This demonstrates that 
S. venezuelae pJ0436 clones are quite stable segregationally. 

Expression stability of S. venezuelae 10712 (pJ0436 :: actinorhodin) [000402] Expression of the 
actinorhodin gene cluster in S. venezuelae 10712 has been demonstrated. However, when this clone was 
grown in liquid cultures it failed to produce actinorhodin, as determined by the absence of its blue color. 
Nonetheless, when mycelia from such cultures were plated on solid media, actinorhodin producing colonies 
were clearly evident. The majority of the colonies produced a faint blue color while a few colonies 
produced abundant actinorhodin. The colonies which produce actinorhodin abundantly have been named
HBC (hyper blue clones).

[000403] These observations suggest that in HBC clones a host mutation has occurred which
allows very efficient actinorhodin expression. Mutations which could lead to efficient actinorhodin
expression could include a variety of targets, such as elimination of negative regulators like cutRS,
overexpression of positive regulators, or efficient expression of pathways which provide precursors for
actinorhodin. The hyper production of actinorhodin by the HBC clones thus strongly suggests that it is
indeed possible for us to construct a strain which is more optimized for heterologous expression of small
molecules, by random mutagenesis or by specific cutRS knockout mutagenesis.

Construction of a jadomycin blocked mutant of S. venezuelae

[000404] Orf1 of the jadomycin biosynthetic
gene cluster was chosen as a target. Primers were designed so as to amplify jad-L and jad-R fragments with
proper restriction sites for future subcloning. S. venezuelae is reasonably sensitive to hygromycin and
therefore the hygromycin resistance gene will be used to disrupt the orf-1 gene. The strategy used for
disrupting the jadomycin orf-1 is described in the attached figure. The hyg-disrupted copy of the orf-1 gene
will then be placed on pKC1218 and used for gene replacement in the S. venezuelae 10712, as well as
VS153, chromosome.

Expression of the yellow clone in S. venezuelae

[000405] The single arm rescue technique to recover the
yellow clone insert from S. lividans clone 525Sm575 was described. The recovered clone #3 was mated
into S. venezuelae 10712 as well as VS153. Yellow color was evident after several days on both 10712 and
VS153 plates but absent in the pJ0436 vector alone controls. Three 10712 yellow clones were
grown in liquid R2-S medium and all three produced yellow color profusely. This experiment has validated
S. venezuelae as a host and pJ0436 as the vector for heterologous expression for the second time, the first
time being with the actinorhodin gene cluster. This yellow clone insert could now be used in validation of
different strains in our strain improvement program.

3. Development of a mating protocol in a microtiter plate format. 

[000406] In order to have the individual E. coli donor clones archived, we are attempting to develop a
mating protocol in a microtiter plate format. According to this protocol, we plan to sort the E. coli library
into a 96-well microtiter plate. The matings with S. diversa would then be done on an R2-S agar plate in
an array format corresponding to the 96-well microtiter plate containing the E. coli clones. The bioassays
can be either conducted on the mating R2-S plate or the clones can be first replica plated onto another
suitable agar plate and then bioassayed. This approach will allow us to go back to the E. coli clones once
we detect a bioactive clone among the S. diversa exconjugant library. The E. coli clone can then be mated
back into S. diversa for re-transformation and confirmation of the bioactivity.

[000407] In a preliminary experiment, matings were done by spotting S. diversa spores together with E.
coli donor cells on an R2-S agar plate (rather than spreading). After about 8 hours the plate was overlayed as
usual with apramycin and nalidixic acid. The exconjugants appeared only on those spots where E. coli donor
was added, but not on those spots containing S. diversa spores alone. These initial data are very promising,
although some more standardization needs to be done to develop this technique fully.

Example 12: Production of single cells or fragmented mycelia

[000408] In order to produce single cells or
fragmented mycelia, 25 ml MYM media (see recipe below) was inoculated in a 250 ml baffled flask with 100
ul of Streptomyces 10712 spore suspension and incubated overnight at 30°C, 250 rpm. After a 24 hour
incubation, 10 ml was transferred to a 50 ml conical polypropylene centrifuge tube and centrifuged at
4,000 rpm for 10 minutes at 25°C. Supernatant was decanted and the pellet was resuspended in
10 ml 0.05 M TES buffer. The cells were sorted onto MYM agar plates (sort 1 cell per drop, 5 cells per drop,
10 cells per drop) and we incubated the plates at 30°C.

[000409] MYM media (Stuttard, 1982, J. Gen. Microbiol. 128: 115-121) contains: 4 g maltose, 10 g malt
extract, 4 g yeast extract, 20 g agar, pH 7.3, water to 1 L.

Example 13: An exemplary method for the discovery of novel enzymes [000410] The following describes a 
method for the discovery of novel enzymes requiring large substrates (e. g. , cellulases, amylases, 
xylanases) using the ultra high throughput capacity of the flow cytometer. As these substrates are too large 
to get into a bacterial cell, a strategy other than single intracellular detection must be employed in order to 
use the flow cytometer. For this purpose, we have adapted the gel microdrop (GMD) technology (One Cell
Systems, Inc.). Specifically, the enzyme substrate is captured within the GMD and the enzyme allowed to
hydrolyze the substrate within this microenvironment. However, this method is not limited to any particular
gel microdrop technology. Any microdrop-forming material that can be derivatized with a capture molecule
can be used. The basic experimental design is as follows: Encapsulate individual bacteria containing DNA
libraries within the GMDs and allow the bacteria to grow to a colony size containing hundreds to thousands 
of cells each. The GMDs are made with agarose derivatized with biotin, which is commercially available 
(One Cell Systems). After appropriate colony growth, streptavidin is added to serve as a bridge between a 
biotinylated substrate and the biotin-labeled agarose. Finally, the biotinylated substrate will be added to the 
GMD and captured within the GMD through the biotin-streptavidin-biotin bridge. The bacterial cells will 
be lysed and the enzyme released from the cells. The enzyme will catalyze the hydrolysis of the substrate, 
thereby increasing the fluorescence of the substrate within the GMD. The fluorescent substrate will be 
retained within GMD through the biotin-streptavidin-biotin bridge and thus, will allow isolation of the 
GMD based on fluorescence using the flow cytometer. The entire microdrop will be sorted and the DNA 
from the bacterial colony recovered using PCR techniques. This technique can be applied to the discovery 
of any enzyme that hydrolyzes a substrate with the result of an increased fluorescence. Examples include 
but are not limited to glycosidases, proteases, lipases, ferulic acid esterases, secondary amidases, and the
like. 

[000411] One system uses a biotin capture system to retain secreted antibodies within the GMD. The
system is designed to isolate hybridomas that secrete high levels of a desired antibody. The basic design is
to form a biotin-streptavidin-biotin sandwich using the biotinylated agarose, streptavidin, and a biotinylated
capture antibody that recognizes the secreted antibody. The "captured" antibody is detected by a
fluoresceinated reporter antibody. The flow cytometer is then used to isolate the microdrop based on 
increased fluorescence intensity. The potentially unique aspect to the method described here is the use of 
large fluorogenic substrates for the determination of enzyme activity within the GMD. 

Additionally, this example uses bacterial cells containing DNA libraries instead of eukaryotic cells and is 
not confined to secreted proteins as the bacterial cells will be lysed to allow access to the enzymes. 
[000412] The fluorogenic substrates can be easily tailored to the particular enzyme of interest. Described
below is a specific example of the chemical synthesis of an esterase substrate. Additionally, two examples 
are given which describe the different possible chemical combinations that can be used to make a wide 
variety of substrates. 

Example of Reaction Sequence Leading to GMD-Attachable Substrate [000413] In the first step,
1-amino-11-azido-3,6,9-trioxaundecane [Reference 3], an asymmetric spacer, is attached to the N-hydroxysuccinimide
ester of 5-carboxyfluorescein (Molecular Probes). After reduction of the azide functional group on the end
of the attached spacer (step 2), activated biotin (Molecular Probes) is attached to the amine terminus (step 
3), and the sequence is completed by esterification of phenolic groups of the fluorescein moiety (step 4). 
The resulting compound can be used as a substrate in screens for esterase activity. 

Design of GMD-Attachable Fluorogenic Substrates

[Structure diagram; legible labels: R1, C1, Fluor, C3, S, R2]

[000414] Fluor-core fluorophore structure, capable of forming fluorogenic derivatives, e.g., coumarins, resorufins, xanthenes,
and others. 

[000415] Spacer-a chemically inert moiety providing connection between biotin moiety and the 
fluorophore. Examples include alkanes and oligoethyleneglycols. The choice of the type and length of the 
spacer will affect synthetic routes to the desired products, physical properties of the products (such as 
solubility in various solvents), and the ability of biotin to bind to deep pockets in avidin.

[000416] C1, C2, C3, C4-connector units, providing covalent links between the core fluorophore structure
and other moieties. C1 and C2 affect the specificity of the substrates towards different enzymes. C3 and C4
determine stability of the desired product and synthetic routes to it. Examples include ether, amine, amide, 
ester, urea, thiourea, and other moieties. 

[000417] R1 and R2-functional groups, attachment of which provides for quenching of fluorescence of the
fluorophore. These groups determine the specificity of substrates towards different enzymes. Examples
include straight and branched alkanes, mono- and oligosaccharides, unsaturated hydrocarbons and aromatic
groups.

Design of GMD-Attachable Fluorescence Resonance Energy Transfer Substrates

[Structure diagram; legible labels: Polymer, Fluor, C1, S, C2, Quencher]

[000418] Fluor-A fluorophore. Examples include acridines, coumarins,
fluorescein, rhodamine, BODIPY, resorufin, porphyrins, etc.

[000419] Quencher-A moiety which is capable of quenching fluorescence of the fluorophore when located
at a close enough distance. Quencher can be the same moiety as the fluorophore or a different one. 

[000420] Polymer is a moiety consisting of several blocks, a bond between which can be cleaved by an
enzyme. Examples include amines, ethers, esters, amides, peptides, and oligosaccharides.

[000421] C1 and C2 are equivalent to C3 and C4 in the previous design.

[000422] Spacer is equivalent to Spacer in the previous design. 

References:

[1] Gray, F., Kenney, J. S., Dunne, J. F. Secretion capture and report web: use of affinity
derivatized agarose microdroplets for the selection of hybridoma cells. J. Immunol. Meth. 1995, 182, 155-163.

[2] Powell, K. T. and Weaver, J. C. Gel microdroplets and flow cytometry: Rapid determination of
antibody secretion by individual cells within a cell population.

[3] Schwabacher, A. W.; Lane, J. W.; Schiesher, M. W.; Leigh, K. M.; Johnson, C. W. J. Org. Chem. 1998, 63, 1727-1729.

Example 14: An exemplary ultra high throughput screen: a recombinant approach [000423] This example 
demonstrates an ultra high throughput screen for the discovery of novel anticancer agents. This method 
uses a recombinant approach to the discovery of bioactive molecules. The examples use complex DNA 
libraries from a mixed population of uncultured microorganisms that provide a vast source of natural 
products through recombinant expression from whole gene pathways. The two objectives of this Example 
include: 1) Engineering of mammalian cell lines as reporter cells for cancer targets to be used in ultra-high 
throughput assay system. 

2) Detection of novel anticancer agents using an ultra high throughput FACS-based screening format.

[000424] The present invention provides a new paradigm for screening technologies that brings the small 
molecule libraries and target together in a three dimensional ultra high throughput screen using the flow 
cytometer. In this format, it is possible to achieve screening rates of up to 10^8 per day. The feasibility of
this system is tested using assays focused on the discovery of novel anti-cancer agents in the areas of signal 
transduction and apoptosis. 

Development of a validated assay should have a profound impact on the rate of discovery of novel lead 
compounds. 

Experimental Design and Methods 1. Development of cell lines [000425] The goal of this example is to 
develop an ultra high throughput screening format that can be used to discover novel chemotherapeutic 
agents active against a range of molecular targets known to be important in cancers. The feasibility of this 
approach will be tested using mammalian cell lines that respond to activation of the epidermal growth 
factor receptor (EGFR) with induction of expression of a reporter protein. The EGFR-responsive cells will 
be brought together with our microbial expression host within a microdrop (see Example 13 and co-pending
U.S. Patent No. 6,280,926 and U.S. application Serial No. 09/894,956, both herein incorporated by reference).
These expression hosts will be Streptomyces or E. coli
and will contain libraries derived from a mixed population of organisms, i.e., high molecular weight
environmental DNA (10-100 kb fragments) cloned into the appropriate vectors and transferred to the host.
These large DNA fragments will contain biosynthetic operons which consist of the genes necessary to 
produce a bioactive small molecule. A bioactive molecule from the microbial host will elicit a biological 
response in the mammalian cell which will induce expression of a fluorescent reporter. The entire 
microdrop will be individually sorted on the flow cytometer based on fluorescence and the DNA from the 
host recovered. The mixed population libraries may contain from 10^4 to 10^10 clones, including 10^5, 10^6, 10^7,
10^8, 10^9, or any multiple thereof.

[000426] An assay based on the EGF receptor was chosen because of its possible role in the pathogenesis
of several human cancers. The EGF-mediated signal transduction pathway is very well characterized and
several inhibitors of the EGF receptor have been found from natural sources (21, 22). The EGFR is one of 
the early oncogenes discovered (erbB) from the avian erythroblastosis retrovirus and due to a deletion of 
nearly all of the extracellular domain, is constitutively active (23). Similar types of mutations have been 
found in 20-30% of cases of glioblastoma multiforme, a major human brain tumor (24). Overexpression of 
EGFR correlates with a poor prognosis in bladder cancer (25), breast cancer (26,27), and glioblastoma 
multiforme (28). Most of these cancers occur in an EGF-secreting background, demonstrating an
autocrine growth mechanism in these cancers. Additionally, EGFR is over-expressed in 40-80% of non- 
small cell lung cancers and EGF is overexpressed in half of primary lung cancers, with patient prognosis 
significantly reduced in cases with concurrent expression of EGFR and EGF (29,30). For these reasons, 
inhibitors of the EGF receptor are potentially useful as chemotherapeutic agents for the treatment of these 
cancers. 

[000427] The goal of this experiment is to create mammalian cell lines that serve as reporter cells for 
anticancer agents. HeLa cells endogenously express the EGFR as confirmed by FACS analysis using the 
anti-EGFR antibody, Ab-1 (Calbiochem). In contrast, CHO cells have little or no expression of the EGFR. 
The gene encoding EGFR was obtained from Dr. Gordon Gill (University of California, San Diego) and
cloned into the pcDNA3/hygro vector. The
resulting vector was transfected into CHO cells and stable transformants selected with hygromycin. 
Enrichment of high EGFR-expressing CHO cells was performed through two rounds of FACS sorting 
using the anti-EGFR antibody. For detection of the activated pathway, a parallel approach is being taken 
utilizing both the PathDetect system from Stratagene (San Diego, CA) and the Mercury Profiling system 
from Clontech (San Diego, CA). The PathDetect system has been validated by researchers as a means of
detecting mitogenic stimuli (31,32).

[000428] The EGFR is a tyrosine kinase receptor that functions through the MAP-kinase pathway to
activate the transcription factor Elk-1 (33). The PathDetect product includes a fusion trans-activator
plasmid (pFA-Elk1) that encodes for expression of a fusion protein containing the activation domain of the
Elk-1 transcription activator and the DNA binding domain of the yeast GAL4. A second plasmid contains a 
synthetic promoter with five tandem repeats of the yeast GAL4 binding sites that control expression of the 
Photinus pyralis luciferase gene. The luciferase gene was removed and replaced with the gene encoding for 
the destabilized version of the enhanced green fluorescent protein (EGFP) (plasmid designated pFR-d2EGFP).
The two plasmids were transfected together into the EGFR/CHO and HeLa cells at a ratio of 10:1
(pFR-EGFP:pFA-Elk1) and stable transformants selected using the neomycin resistance gene located on
the pFA-Elk1 plasmid.

Thus, ligand binding to the EGFR will initiate a signal transduction cascade that results in activation of the
Elk-1 portion of the fusion protein, allowing the DNA binding domain of the yeast GAL4 to bind to its
promoter and turn on expression of EGFP. 

[000429] Stimulation in the presence of serum is not surprising as this signal transduction pathway is 
common to most growth factors and it is likely that many growth factors including EGF are present in the 
serum. After 24 hours of significant serum starvation, this response is greatly reduced (Figure 2A). The
next step will be to selectively stimulate these cells with recombinant EGF (Calbiochem) and isolate the 
highly responsive single clones using the flow cytometer. These clones will be selected by sorting 
simultaneously for high levels of GFP and the EGFR. The EGFR will be detected using an anti-EGFR 
antibody with a secondary antibody labeled with phycoerythrin. This system has the advantage that use of 
the yeast GAL4 promoter in these cells should keep background or spurious induction of EGFP to a 
minimum. 

[000430] The second group of cell lines uses the Mercury Profiling system to assay the same EGFR 
pathway. This system responds to activation of the pathway with an increase in the expression of human 
placental secreted alkaline phosphatase (SEAP). A fluorescent signal will be obtained by the addition of the 
phosphatase substrate ELF-97-phosphate (Molecular Probes), which yields a bright fluorescent precipitate 
upon cleavage. The advantage of this approach over the PathDetect system is the ability to amplify the 
signal through enzyme catalysis for low-level activation of the pathway. This parallel approach will 
increase the probability of success in finding bioactive compounds. In the Mercury Profiling system, a
vector containing the cis-acting enhancer element SRE and the TATA box from the thymidine kinase 
promoter is used to drive expression of alkaline phosphatase (pTA-SEAP). 

This system relies on the endogenous transactivators present in the cell, such as Elk-1, to bind the SRE 
element on the vector and drive expression of SEAP upon stimulation of EGFR. 

The pTA-SEAP vector was transfected into the EGFR/CHO and HeLa cells and stable transformants
selected using neomycin. Again, stimulation of the pathway occurred in the presence of serum factors in
the media. Upon serum starvation, this response was greatly reduced (Figure 2B). Single high expressing
clones will be isolated following stimulation with EGF and sorting using a flow cytometer. 

Development of ultra high throughput FACS assay [000431] Complex mixed population libraries
(>10^6 primary clones/library) were generated that provide access to the untapped biodiversity that exists
in the >99% of microorganisms that are uncultivable. These novel libraries require the development of ultra high
throughput screening methods to obtain complete coverage of the library. We propose developing an assay
using the flow cytometer that allows detection of up to 10^8 clones/day.
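
The relationship between sorting throughput and library coverage can be illustrated with a short calculation. The sketch below is an illustration only: it assumes the instrument runs continuously for 24 hours and that each screened event samples one library clone uniformly at random, neither of which is stated in this Example. All function names are hypothetical.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def sort_rate_per_second(events_per_day: float) -> float:
    """Average event rate implied by a daily throughput (continuous operation assumed)."""
    return events_per_day / SECONDS_PER_DAY


def days_for_coverage(library_size: float, fold_coverage: float,
                      events_per_day: float) -> float:
    """Days needed to screen library_size * fold_coverage events."""
    return library_size * fold_coverage / events_per_day


def probability_clone_sampled(library_size: float, events_screened: float) -> float:
    """Approximate chance that a given clone is seen at least once.

    Uses 1 - exp(-n/N), the large-library limit of 1 - (1 - 1/N)**n,
    under the uniform random sampling assumption.
    """
    return 1.0 - math.exp(-events_screened / library_size)


if __name__ == "__main__":
    print(f"{sort_rate_per_second(1e8):.0f} events/s")
    print(f"{days_for_coverage(1e6, 3, 1e8):.2f} days for 3-fold coverage of 10^6 clones")
    print(f"{probability_clone_sampled(1e6, 3e6):.3f} chance a given clone is sampled")
```

Under these assumptions, 10^8 events per day corresponds to roughly 1,157 events per second, and a library of 10^6 primary clones can be covered several-fold in well under a day.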

[000432] In this assay format (Figure 1), an expression host (Streptomyces, E. coli) and a mammalian 
reporter cell will be co-encapsulated together within a microdrop. The microdrop holds the cells in close 
proximity to each other and provides a microenvironment that facilitates the exchange of biomolecules 
between the two cell types. The reporter cell will have a fluorescent readout and the entire microdrop will 
be run through the flow cytometer for clonal isolation. The DNA from the genes or pathway of interest will 
subsequently be recovered using in vitro molecular techniques. This assay format will be validated for the 
discovery of both EGFR inhibitors as well as for small molecules that induce apoptosis. With validation of 
this format, we will progress to the ultra high throughput screening phase designed to discover novel 
chemotherapeutic agents active against these important molecular mechanisms underlying tumorigenesis. 

[000433] The feasibility of this approach will be analyzed initially using the engineered cell lines described 
above that respond to activation by EGF with increased expression of a reporter protein (i. e. EGFP or 
alkaline phosphatase). Additionally, this initial study will use an E. coli host that over-expresses human 
EGF as a secreted protein directed to the bacterial periplasm (34). This approach will allow us to validate 
the assay format prior to screening for inhibitors of the EGFR pathway using our E. coli and Streptomyces 
expression libraries. 

For this experiment, the engineered cell lines will be co-encapsulated together with the E. coli host at a 
ratio of one to one. The EGF-expressing bacteria will be allowed to grow and form a colony within the 
microdrop. Due to the vastly higher growth rate of bacteria, a colony of bacteria will form prior to any or 
minimal cell division of the eukaryotic cell. This colony will then provide a significantly increased 
concentration of the bioactive molecule. The bacterial colony will be selectively lysed using the antibiotic 
polymyxin at a concentration that allows cell survival (35). This antibiotic acts to perforate bacterial cell 
walls and should result in the release of EGF from these cells without affecting the eukaryotic cell. In the 
final discovery assays, this lysis treatment should not be necessary as the small molecule products will 
likely be able to freely diffuse out of the cell. The EGF will activate the signal transduction pathway in the 
eukaryotic cell and turn on expression of the reporter protein. 
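
The one-to-one co-encapsulation can be reasoned about with a simple occupancy model. The sketch below assumes, purely as an illustration and not as something stated in this Example, that bacteria and reporter cells are distributed among microdrops independently and at random (Poisson loading). The function names and the mean occupancies are hypothetical.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k cells in a microdrop under Poisson loading."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def fraction_one_of_each(mean_bacteria: float, mean_reporters: float) -> float:
    """Fraction of microdrops holding exactly one bacterium and one reporter cell.

    Assumes the two cell types load independently (an illustrative assumption).
    """
    return poisson_pmf(1, mean_bacteria) * poisson_pmf(1, mean_reporters)


if __name__ == "__main__":
    # Hypothetical mean occupancy of one cell of each type per microdrop.
    print(f"{fraction_one_of_each(1.0, 1.0):.3f}")
```

With a mean of one cell of each type per microdrop, this model predicts that only about 13.5% of microdrops contain exactly one bacterium and one reporter cell, which is one reason the occupancy actually obtained would need to be checked empirically.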

[000434] The microdrops will be run through the flow cytometer and those microdrops exhibiting an 
increased fluorescence will be sorted. The DNA from the sorted microdrops will be recovered using PCR
amplification of the insert encoding for EGF. For the reporter cells expressing secreted alkaline 
phosphatase, a couple of additional steps are required to achieve a fluorescent readout. As the enzyme is 
secreted from the cell, it is possible to prevent the diffusion of the protein from the microdrop by
selectively capturing it within the matrix of the microdrop. This can be accomplished by using microdrops
made with agarose derivatized with biotin. By forming a sandwich with streptavidin and a biotinylated anti- 
alkaline phosphatase antibody, it is possible to capture alkaline phosphatase where it can catalyze the 
conversion of the ELF-97 phosphate substrate within the microdrop (Figure 3A). 

This technique was successfully developed by One Cell Systems for the isolation of high expressing
hybridomas (36,37). In our hands, with the encapsulation of the SEAP expressing cells, we have shown that
upon addition of the ELF-97 phosphatase substrate, a fluorescent precipitate forms within the microdrop
(Figure 3B&C). 

[000435] Initial experiments demonstrate the feasibility of co-encapsulating E. coli and mammalian cells 
(e. g. , CHO) within microdrops. Microdrops were formed using 3% agarose dropped in oil and blended at 
2600 rpm. The E. coli and CHO cells were encapsulated at a ratio of 1: 1 (Figure 4A). After 6 hours, the 
single bacterial cell grew into a colony containing thousands of cells (Figure 4B). The cells within the
microdrops were stained with propidium iodide to determine viability and approximately 70-85 % of the 
CHO cells remained viable after 24 hours. Subsequent steps include determining the response of 
encapsulated clonal EGF-responsive mammalian cells to varying concentrations of EGF in the presence 
and absence of EGFR inhibitors such as Tyrphostin A46 or Tyrphostin A48 (Calbiochem). In addition, E. 
coli clones producing high levels of secreted EGF will be isolated using the Quantikine human EGF 
immunoassay (R&D Systems). Finally, these two cell types will be brought together within the microdrop and a
change in fluorescence of the eukaryotic cell will be analyzed on the flow cytometer in the presence and 
absence of the EGFR inhibitors. 

A positive result in this experiment would be an increase in fluorescence that can be blocked by the EGFR 
inhibitors. 

[000436] The next step will be to mix the EGF-expressing E. coli with non-expressing cells at varying 
ratios from 1:1,000 to 1:1,000,000 to mimic the conditions of a mixed population library discovery
screen. The bacterial mixtures and the mammalian cells will be co-encapsulated as described above. The 
highly fluorescent microdrops will be individually sorted by the flow cytometer. To confirm a positive hit, 
the DNA will be recovered by PCR amplification using primers directed against the EGF gene. To improve 
the signal to noise ratio, it is likely that it will be necessary to undergo several rounds of enrichment before 
isolation of positive EGF-expressing clones, especially for the higher mixture ratios. 

[000437] In this case, the microdrops will first be sorted in bulk, the microdrop material removed with 
GELase (Epicentre Technologies) and the bacteria allowed to grow. The encapsulation protocol will be 
repeated with fresh eukaryotic cells until a highly enriched population is observed. At this point, single 
microdrops will be isolated and recovery of the EGF-expressing clone confirmed by PCR. With validation 
of this assay, the goal will be to screen for inhibitors of the EGFR using our mixed population libraries 
expressed in optimized E. coli and Streptomyces hosts. This assay will be done in the presence of EGF and 
the assay endpoint will be a decrease in fluorescence. This format is not limited to only EGFR inhibitors as 
any protein within this pathway could be inhibited and would appear positive in this screen. Likewise, this 
screen can also be adapted to the multitude of anti-cancer targets that are known to regulate gene
expression. In fact, using this present system, with the addition of the appropriate receptors, it would be 
possible to screen for inhibitors of other growth factors such as PDGF and VEGF. 

[000438] If an increase in fluorescence is not observed with co-encapsulation of the EGF-expressing cells 
and the mammalian reporter cell, there could be several reasons. First, it is possible that the EGF diffuses
out of the cell too quickly to elicit a response. In this case, it will be necessary to modify the microdrops to 
limit diffusion and concentrate the bioactive molecule at the site of the reporter cell. It is also possible that 
in the specific case of the EGF assay, the cells will not continue to produce EGF after polymyxin treatment
and thus, the incubation time of the reporter cells with EGF will be minimal. This is unlikely as the
polymyxin treatment used will be at concentrations well below that which produces decreased cell
viability. However, if EGF is not continually expressed in this system, other permeabilization methods will
be explored that do not significantly affect cell metabolism, such as the bacteriocin release protein (BRP)
system (Display Systems Biotech). The BRP opens the inner and outer membranes of E. coli in a controlled 
manner enabling protein release into the culture medium. This system can be used for large-scale protein 
production in a continuous culture and thus should be compatible with cell survival. 

[000439] Apoptosis, or programmed cell death, is the process by which the cell undergoes genetically 
determined death in a predictable and reproducible sequence. This process is associated with distinct 
morphological and biochemical changes that distinguish apoptosis from necrosis. The malfunctioning of 
this essential process can often lead to cancer by allowing cells to proliferate when they should either
self-destruct or stop dividing.

Thus, the mechanisms underlying apoptosis are currently under intense scrutiny from the research 
community and the search for agents that induce apoptosis is a very active area of discovery. 

[000440] The present invention provides an assay for the discovery of apoptotic molecules using our ultra 
high throughput encapsulation technology. The source of these small molecules will come from our 
extremely complex mixed population libraries expressed in Streptomyces and E. coli host strains. These 
host strains will be co-encapsulated together with a eukaryotic reporter cell, the small molecule will be 
produced in the bacterial strain, and will act on the mammalian reporter cell which will respond by 
induction of apoptosis. 

Apoptosis will be detected using a fluorescent marker, the entire microdrop sorted using the flow 
cytometer, and the DNA of interest recovered. The feasibility of this assay will be determined using our 
optimized Streptomyces host strain, S. diversa, co-encapsulated with the apoptotic reporter cell derived 
from human T cell leukemia (e. g. , Jurkat cells). The pathway controlling production of the anti-tumor 
antibiotic, bleomycin, will be cloned into S. diversa as the source of an apoptosis-inducing agent. The 
readout for induction of apoptosis in Jurkat cells will be obtained using the fluorescent marker,
Alexa 488-annexin V.

[000441] The bleomycin group of compounds are anti-tumor antibiotics that are currently being used 
clinically in the treatment of several types of tumors, notably squamous cell carcinomas and malignant 
lymphomas. However, widespread use of bleomycin congeners has been limited due to early drug 
resistance and the pulmonary toxicity that develops concurrent with administration of this drug. Thus, there 
is continuing effort to find novel small molecules with better clinical efficacy and lower toxicity. 
Bleomycin congeners are peptide/polyketide metabolites that function by binding to sequence selective 
regions of DNA and creating single and double stranded DNA breaks. Several in vitro and in vivo assays 
have shown that bleomycin induces apoptosis in eukaryotic cells (43-45). The biosynthetic gene cluster 
encoding for the production of bleomycin has recently been cloned from Streptomyces verticillus and is 
encoded on a contiguous 85 kb fragment (46). We propose to clone this pathway into a BAC vector to use 
as a source of apoptotic agents in eukaryotic cells. A library will be made from the S. verticillus 
ATCC15003 strain and cloned into the BAC vector, pBlumate2. As the sequence for this pathway is 
known, probes will be designed against sequences from the 5' and 3' ends of the pathway. The library will be
introduced into E. coli and screened using colony hybridization with the probe generated against one end of 
the pathway. Positive clones will subsequently be screened with the second probe to identify which clone 
contains the entire pathway. Clones containing the complete pathway will be transferred into our optimized 
expression host S. diversa by mating. Expression of bleomycin will be detected using whole cell bioassays 
with Bacillus subtilis. 
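
The logic of the two-probe screen can be sketched as follows. This is an illustration under stated assumptions: each BAC clone is modeled as a single contiguous interval of genomic coordinates, and a clone is treated as hybridizing to a probe whenever its interval contains the probe position. The clone names, coordinates, and function names are all hypothetical.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) of a clone insert, in kb (hypothetical coordinates)


def hybridizes(span: Span, probe_position: int) -> bool:
    """A clone is scored positive if its insert covers the probe position."""
    start, end = span
    return start <= probe_position <= end


def full_pathway_clones(clones: Dict[str, Span],
                        probe_5_prime: int,
                        probe_3_prime: int) -> List[str]:
    """Screen with the first probe, then re-screen the positives with the second."""
    first_pass = [name for name, span in clones.items()
                  if hybridizes(span, probe_5_prime)]
    return sorted(name for name in first_pass
                  if hybridizes(clones[name], probe_3_prime))


if __name__ == "__main__":
    # Hypothetical library: probes sit 85 kb apart, matching the reported pathway length.
    library = {"A": (0, 120), "B": (50, 100), "C": (10, 60)}
    print(full_pathway_clones(library, probe_5_prime=20, probe_3_prime=105))
```

Because the insert is contiguous, a clone positive for both end probes necessarily spans the region between them, which is why two probes suffice to identify clones carrying the entire pathway.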
[000442] Jurkat cells are the classic human cell line used for studies of apoptosis. The fluorescent Alexa
488 conjugate of annexin V (Molecular Probes) will be used as the marker of apoptosis in these cells.
Annexin V binds to phosphatidylserine molecules normally located on the internal portion of the
membrane in healthy cells. During early apoptosis, this molecule flips to the outer leaf of the membrane
and can be detected on the cell surface using fluorescent markers such as the annexin V-conjugates. The
bleomycin-induced apoptotic response in Jurkat cells will initially be characterized by varying both the
concentrations of the exogenously administered drug and the incubation time with the drug. Alexa 488-annexin
V will then be added to the cells and the level of fluorescence analyzed on the flow cytometer.
Necrotic cell death will be determined using propidium iodide and the apoptotic population will be 
normalized to this value. 
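
One way to read the annexin V and propidium iodide measurements together is sketched below. It is an illustration only: the gate thresholds are hypothetical, and the normalization shown (apoptotic events as a fraction of the events that are not propidium iodide positive) is one possible interpretation of normalizing the apoptotic population to the necrotic value, not a method specified in this Example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Event:
    annexin: float  # annexin V conjugate fluorescence (arbitrary units)
    pi: float       # propidium iodide fluorescence (arbitrary units)


def classify(event: Event, annexin_gate: float, pi_gate: float) -> str:
    """Label one event using two hypothetical fluorescence thresholds."""
    if event.pi >= pi_gate:
        return "necrotic"   # membrane no longer excludes propidium iodide
    if event.annexin >= annexin_gate:
        return "apoptotic"  # phosphatidylserine exposed, membrane still intact
    return "viable"


def apoptotic_fraction(events: Iterable[Event],
                       annexin_gate: float, pi_gate: float) -> float:
    """Apoptotic events as a fraction of all events that are not necrotic."""
    labels = [classify(e, annexin_gate, pi_gate) for e in events]
    non_necrotic = [label for label in labels if label != "necrotic"]
    if not non_necrotic:
        return 0.0
    return non_necrotic.count("apoptotic") / len(non_necrotic)


if __name__ == "__main__":
    sample = [Event(10, 1), Event(500, 1), Event(600, 900), Event(5, 2)]
    print(apoptotic_fraction(sample, annexin_gate=100, pi_gate=100))
```

In practice the gates would be set from unstained and single-stained controls rather than fixed in advance.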

[000443] Co-encapsulation of S. diversa with CHO cells within microdrops produced very similar results
to the E. coli co-encapsulation. S. diversa grew well in the eukaryotic media and the CHO cell survival rate
was high after 24 hours. In this experiment, the S. diversa clone expressing bleomycin will be
co-encapsulated with the Jurkat cell line. S. diversa will be allowed to grow into a colony within the microdrop
and begin production of bleomycin. The microdrops will be periodically analyzed over time for induction
of apoptosis using the Alexa 488-annexin V conjugate on the microscope and flow cytometer.

After noting the time for induction of apoptosis, a mixing experiment similar to that described for the EGF 
experiment will be performed. Bleomycin-expressing and non-expressing cells will be mixed together at
ratios of 1:1,000 to 1:1,000,000. Co-encapsulation of the mixtures with Jurkat cells will be performed and
the appropriate incubation time maintained. These microdrops will then be stained with Alexa 488-annexin
V and sorted on the flow cytometer. Confirmation of a positive bleomycin-expressing sorted clone will be
performed by PCR amplification of a portion of the pathway. Again, it is likely that enrichment of these
mixtures will be necessary using a few rounds of bulk sorting on the flow cytometer.
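
The effect of repeated bulk sorting on a dilute mixture, as proposed here and for the EGF mixing experiment, can be estimated with a simple model. The sketch below assumes that each round multiplies the odds of positive to negative clones by a fixed enrichment factor. That factor, the target purity, and the function name are hypothetical values chosen for illustration, not measured results.

```python
def rounds_to_target(start_fraction: float,
                     enrichment_per_round: float,
                     target_fraction: float) -> int:
    """Count bulk-sort rounds needed for positives to reach target_fraction.

    Each round multiplies the odds (positives : negatives) by a fixed,
    hypothetical enrichment factor.
    """
    if not 0.0 < start_fraction < 1.0 or not 0.0 < target_fraction < 1.0:
        raise ValueError("fractions must lie strictly between 0 and 1")
    if enrichment_per_round <= 1.0:
        raise ValueError("enrichment_per_round must exceed 1")
    odds = start_fraction / (1.0 - start_fraction)
    target_odds = target_fraction / (1.0 - target_fraction)
    rounds = 0
    while odds < target_odds:
        odds *= enrichment_per_round
        rounds += 1
    return rounds


if __name__ == "__main__":
    # Starting mixtures spanning the 1:1,000 to 1:1,000,000 range described above.
    for start in (1e-3, 1e-6):
        print(start, rounds_to_target(start, enrichment_per_round=100.0,
                                      target_fraction=0.9))
```

With an assumed 100-fold enrichment per round, a 1:1,000 mixture reaches 90% positives in two rounds and a 1:1,000,000 mixture in four, consistent with the expectation that the more dilute mixtures need more rounds before single microdrops are isolated.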

[000444] If no apoptosis is observed in the initial assay, confirmation of bleomycin production will be 
performed by sorting of the encapsulated S. diversa clone into 1536 well plates. After a predetermined 
incubation period, the supernatant will be removed and spotted on filter disks for whole cell bioassays
using the susceptible strain B. subtilis. Use of the 1536 well plates will hopefully avoid significant dilution
of the antibiotic in the media. As cloning of the bleomycin pathway is quite recent, it has not yet been 
heterologously expressed from the complete pathway. However, Du et al demonstrated the heterologous 
bioconversion of the inactive aglycones into active bleomycin congeners by cloning a portion of the 
pathway into a S. lividans host (46). If bleomycin expression is not detectable in our assay, we will employ 
a similar strategy using our host strain S. diversa. If little bleomycin production is detected under these 
conditions, it will be necessary to optimize the culture conditions for S. diversa to induce pathway 
expression within the microdrop. On the other hand, if bleomycin is produced but apoptosis is not 
observed, it is possible that the molecule is diffusing away from the microdrop too quickly and it will be
necessary to optimize the microdrop technology to concentrate the metabolite at the site of the reporter cell. 

Optimization of S. diversa secondary metabolite expression in microdrops [000445] Induction of pathway 
expression is an issue that is not limited to the bleomycin example. Bioactive small molecules within
microorganisms are often produced to increase the host's ability to survive and proliferate. These 
compounds are generally thought to be nonessential for growth of the organism and are synthesized with 
the aid of genes involved in intermediary metabolism, hence the name "secondary metabolites." Thus, the
pathways controlling expression of these secondary metabolites are often regulated under non-optimal 
conditions such as stress or nutrient limitation. As our system relies on use of the endogenous promoters 
and regulators, it might be necessary to optimize conditions for maximal pathway expression. 

[000446] There are several methods that can be used to optimize for increased pathway expression within the
microdrops. For easy detection of maximal expression, we will construct a transposon containing a
promoter-less GFP. The enhanced GFP optimized for eukaryotes will be used as it has a codon bias for
high GC organisms. Transposition into a known pathway (e. g. , actinorhodin) will be done in vitro and the 
vector containing the pathway purified. The transposants will be introduced into an E. coli host, screened 
for clones that express GFP, and positive clones isolated on the flow cytometer. With the transfer of the 
promoter-less gene for GFP into the pathway, increased fluorescence within the cells would demonstrate 
transcription of the pathway using the endogenous promoters located within the pathway. This clone will 
be used as a tool for quick detection of upregulation in pathway expression due to changes in the 
experimental conditions. 

[000447] The S. diversa clone containing GFP and the actinorhodin pathway will be encapsulated in the
microdrops and several different growth conditions will be tested, e. g., conditioned media, nutrient 
limiting media, known inducing factors, varying incubation times, etc. The microdrops will be analyzed 
under the microscope and on the flow cytometer to determine which conditions produce optimal expression 
of the pathway. These conditions will be verified for viability in eukaryotic cells as well. These optimized 
growth conditions will be confirmed using the bleomycin pathway to assess production of the secondary 
metabolite. Additionally, whole cell optimization of S. diversa is ongoing with production of strains that 
are missing different pleiotropic regulators that often negatively impact secondary metabolite production. 
As these strains are developed, they will be analyzed in the microdrops for enhanced pathway expression. 

[000448] The proximity of the two cell types within the microdrop should result in a high concentration of 
the bioactive molecule at the site of the reporting cell. However, if rapid diffusion of the molecule from the 
microdrop prevents detection of the desired signal, it will be necessary to optimize the microdrop protocol 
or develop a new encapsulation technology. Concentration of the molecule at the site of the reporter cell 
could be achieved by a reduction in the microdrop pore size. Pore size reduction can be accomplished by 
one or a combination of the following approaches: 

[000449] (i) "plugging" the holes with particles of an appropriate size, which are held in the pores by 
non-covalent or covalent interactions; (ii) cross-linking of the microdrop-forming polymer with low 
molecular weight agents; (iii) creation of an external shell around the microdrop with pores of smaller 
size than those in the current microdrop. 

[000450] Plugging the pores can be accomplished using polydisperse latexes with particles sized to fit 
within the pores of the microdrop. Latex particles may be modified on their surface such that they are 
attracted to the microdrop-forming polymer. For example, agarose-based microdrops carry a negative 
electrostatic charge on the surface. Thus, amidine-modified polystyrene latex particles (Interfacial 
Dynamics Corporation) will be attracted to the microdrop surface and the latex particles will effectively 
plug the microdrop pores provided that the charge density on the latex particles and the microdrop surface 
is high enough to sustain strong electrostatic bonds. 

[000451] Cross-linking of agarose beads can be achieved by treating them with various reagents according 
to known procedures (47). For our purposes, the cross-linking needs to occur only on the surface of 
the microdrop. Thus, it may be advantageous to use polymers carrying reactive groups for cross-linking of 
agarose, such that permeation of the cross-linking agent inside the microdrop is prevented. 

[000452] Formation of classical (48) or polymerizable liposomes (49,50) around microdrops would provide 
a shell that could be an effective barrier even to small molecules. 

A wide variety of precursors for such liposomes as well as methods for their preparation have been reported 
(48-50) and most of them are applicable for our purposes. One of the possible limitations in choice of 
precursors stems from the intended use of microdrops for eventual screening by the flow cytometer. Thus, 
the liposomes should not absorb in the visible part of the spectrum. 

[000453] It might also be necessary to use alternative methods and materials for preparation of the 
microdrops. Encapsulation of cells in polyacrylamide, alginate, fibrin, and other gel-forming polymers has 
been described (51). Another plausible candidate for encapsulation material is silica gel, which can be 
formed under physiological conditions with the assistance of enzymes (silicateins) (52) or enzyme 
mimetics (53). Additionally, various polymers may be used as the material for microdrop construction. 
Microdrops may be formed either upon polymerization of monomers (i.e., water-soluble acrylates or 
methacrylates) or upon gelation and/or cross-linking of preformed polymers (polyacrylates, 
polymethacrylates, polyvinyl alcohol). Since the formation of microdrops occurs simultaneously with 
encapsulation of living cells, such formation has to proceed under conditions compatible with cell survival. 
Thus, the precursors for microdrops (monomers or non-gelated polymers) should be soluble in aqueous 
media at physiological conditions and capable of the transformation into the microdrop material without 
any significant participation and/or emission of toxic compounds. 

Example 15: Identification of a Novel Bioactivity or Biomolecule of Interest by Mass Spectroscopic 
Screening 

[000454] An integrated method for the high throughput identification of novel compounds 
derived from large insert libraries by Liquid Chromatography-Mass Spectrometry (LC-MS) was performed as 
described below. 

[000455] A library from a mixed population of organisms was prepared. An extract of the library was 
collected. Extracts from the libraries were either pooled or kept separate. Control extracts, without a 
bioactivity or biomolecule of interest, were also prepared. 

[000456] Rapid chromatography was used with each extract, or combination of extracts, to aid the 
ionization of the compounds in the extract. Mass spectra were generated for the natural product expression 
host (e.g., S. venezuelae) and vector alone (e.g., pJO436) system. Mass spectra were also generated for the 
host cells containing the library extracts, alone or pooled. The spectra generated from multiple runs of 
either the background samples or the library samples were combined within each set to create a composite 
spectrum. Composite spectra may be generated by using a percentage occurrence of an average intensity of 
each binned mass per time period or by using multiple aligned single mass spectra over a time period. By 
using a redundant sampling method in which each sample was measured several times in the presence of 
other extracts, the novel signals that consistently occurred within a sample extract but not within the 
background spectra were determined. 
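
The percentage-occurrence route to a composite spectrum can be sketched as follows. This is an illustrative Python sketch only and is not the Matlab program listed later in this example; the function names, the unit-mass bin width, the starting mass of 109.8 and the 50% cutoff are assumptions chosen for illustration.

```python
from collections import defaultdict


def bin_spectrum(peaks, start_mass=109.8, width=1.0):
    """Map (m/z, intensity) pairs to the largest intensity seen in each mass bin."""
    binned = {}
    for mz, intensity in peaks:
        if mz < start_mass or intensity <= 0:
            continue  # ignore masses below the scan range and empty signals
        idx = int((mz - start_mass) // width)
        binned[idx] = max(binned.get(idx, 0.0), intensity)
    return binned


def composite_by_occurrence(runs, cutoff_percent=50.0, start_mass=109.8, width=1.0):
    """Return {bin_index: (percent_occurrence, mean_intensity)} for every bin
    whose signal appears in at least cutoff_percent of the runs."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for peaks in runs:
        for idx, intensity in bin_spectrum(peaks, start_mass, width).items():
            counts[idx] += 1
            totals[idx] += intensity
    composite = {}
    for idx, count in counts.items():
        percent = 100.0 * count / len(runs)
        if percent >= cutoff_percent:
            composite[idx] = (percent, totals[idx] / count)
    return composite


# Three hypothetical runs of the same extract: only the signal near m/z 150
# occurs in every run, so it is the only bin kept in the composite.
runs = [
    [(150.2, 10.0), (300.5, 5.0)],
    [(150.4, 20.0)],
    [(150.3, 30.0), (420.0, 1.0)],
]
composite = composite_by_occurrence(runs)
```

Signals that appear in only one of several runs fall below the cutoff and are dropped, which is the intended effect of measuring each sample redundantly.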

[000457] The host-vector background spectrum was compared to the mass spectra obtained from large 
insert library clone extracts. Extra peaks observed in the large insert library clone extracts were considered 
as novel compounds and the cultures responsible for the extracts were selected for scale-up culture so that 
the compound could be isolated and identified. 
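
The comparison of a clone extract spectrum against the host-vector background can be sketched as below. This is a minimal Python illustration under stated assumptions: the function name `unique_peaks`, the 0.5 mass-unit matching tolerance and the example peak lists are hypothetical and do not come from the programs used in this example.

```python
import bisect


def unique_peaks(sample, background, tolerance=0.5, min_intensity=0.0):
    """Return the sample peaks (m/z, intensity) that have no background
    peak within +/- tolerance mass units."""
    background_masses = sorted(mz for mz, _ in background)
    extra = []
    for mz, intensity in sample:
        if intensity <= min_intensity:
            continue
        # First background mass that is >= (mz - tolerance), if any.
        pos = bisect.bisect_left(background_masses, mz - tolerance)
        matched = pos < len(background_masses) and background_masses[pos] <= mz + tolerance
        if not matched:
            extra.append((mz, intensity))
    return extra


# Hypothetical composite spectra for the host-vector control and one clone extract.
background = [(150.0, 10.0), (300.0, 5.0)]
sample = [(150.2, 12.0), (275.1, 8.0), (300.4, 3.0), (512.3, 20.0)]
extra = unique_peaks(sample, background)
```

Peaks returned by such a comparison are only candidates; a secondary screen may still be needed to rule out false positives.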

Novel metabolite identification by mass spectroscopic screening. 

[000458] An integrated method for the high throughput identification of novel compounds derived from 
large insert libraries by LC-MS is described below. Liquid chromatography-mass spectrometry is used to 
determine the background mass spectra of the natural product expression host (e.g., S. diversa DS10 or 
DS4) and vector alone (e.g., pMF17) system. This host-vector background spectrum is compared to the mass 
spectra obtained from large insert library clone extracts. Extra peaks observed in the large insert library 
clone extracts are considered as novel compounds and the cultures responsible for the extracts are selected 
for scale-up culture so the compound can be isolated and identified. 

[000459] In order to create the background and sample spectra, rapid chromatography is used to aid the 
ionization of the compounds in the extract. The spectra generated from multiple runs of either the 
background samples or the library samples are combined within each set to create a composite spectrum. 
Composite spectra may be generated by using a percentage occurrence of an average intensity of each 
binned mass per time period or by using multiple aligned single mass spectra over a time period. Using a 
redundant sampling method whereby each sample is measured several times in the presence of other 
extracts, the novel signals that consistently occur within a sample extract but are not present in the 
background spectra can be determined. The purpose of this invention is to identify novel compounds 
produced by recombinant genes encoding biosynthetic pathways without relying on the compounds having 
bioactivity. This detection method is expected to be more universal than bioactivity for identifying novel 
compounds. 

[000460] Currently there is a similar method of examining culture mixtures by LC-MS with long 
chromatographic times (30-60 min) to bring compounds to a fairly high level of purity. This method relies 
on molecular weight searches for de-replication of known compounds. This slow method would also work 
to identify novel compounds in S. diversa libraries; however, the throughput would be inadequate for the 
number of samples we need to screen. There is a pair of publications describing rapid direct infusion 
analysis of samples to identify fermentation conditions which improve the biosynthetic productivity of 
strains. That method does not identify specific compounds; it only correlates greater, more complex 
production with different culture conditions. 
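
De-replication by molecular weight search, as used in the slower existing method, amounts to looking up each observed mass in a table of known compounds. The Python sketch below illustrates the idea only; the function name, the 0.5 mass-unit tolerance and the table entries (`known_A`, `known_B`) are hypothetical placeholders and not real compound data.

```python
def dereplicate(observed_masses, known_compounds, tolerance=0.5):
    """Split observed masses into those matching a known compound and those
    with no match. known_compounds maps a compound name to its mass."""
    matches = {}
    unmatched = []
    for mass in observed_masses:
        hits = sorted(
            name
            for name, known_mass in known_compounds.items()
            if abs(mass - known_mass) <= tolerance
        )
        if hits:
            matches[mass] = hits
        else:
            unmatched.append(mass)
    return matches, unmatched


# Placeholder lookup table and observed molecular ions (illustrative values only).
known = {"known_A": 300.0, "known_B": 512.3}
matches, unmatched = dereplicate([300.2, 410.7, 512.9], known)
```

Masses left in the unmatched list are the ones that would be carried forward as possible novel compounds.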

[000461] Shown below are the following: 
1. Chromatographic gradient and mass spec conditions: "HPLC and MS setting for Mass Spec Screening.TXT" 
2. Pooling of samples sheet: "Sampling Strategy.htm" 
3. Sample flow using average method: "Mass Spec Screening Flow chart.doc" 
4. Matlab code for original average background: "Mass Spec Screening Summary6 Matlab code.txt" 
5. Matlab code under development for new single aligned peaks background determination for more accurate 
data analysis: "Mass Spec Screening 2nd Data Analysis Program.txt" 

[000462] The method is best practiced with a set of control extracts and sample extracts. Mixing of the 
compounds in pools prior to analysis and deconvolution of the mixed extract pools will provide high 
throughput while maintaining the ability to measure each extract several times. 
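
One way to pool extracts so that each is still measured more than once, and then to deconvolute the pools, is sketched below in Python. The row-and-column grid layout is an assumption made for illustration; it is not the pooling sheet referred to above, and the function names and signal values are hypothetical.

```python
from math import ceil, sqrt


def grid_pools(extracts):
    """Place extracts on a square grid; each extract joins one row pool and
    one column pool, so it is measured twice."""
    side = ceil(sqrt(len(extracts)))
    pools = {}
    for i, name in enumerate(extracts):
        row, col = divmod(i, side)
        pools.setdefault(("row", row), []).append(name)
        pools.setdefault(("col", col), []).append(name)
    return pools


def deconvolute(pools, pool_signals):
    """Assign a binned mass to an extract when the set of pools showing that
    mass is exactly the set of pools containing the extract."""
    membership = {}
    for pool_id, members in pools.items():
        for name in members:
            membership.setdefault(name, set()).add(pool_id)
    observed = {}
    for pool_id, masses in pool_signals.items():
        for mass in masses:
            observed.setdefault(mass, set()).add(pool_id)
    assigned = {name: [] for name in membership}
    for mass, seen_in in sorted(observed.items()):
        for name, member_of in membership.items():
            if seen_in == member_of:
                assigned[name].append(mass)
    return assigned


pools = grid_pools(["E1", "E2", "E3", "E4"])
# Mass 200 is seen in every pool (background-like); mass 455 is seen only in
# the two pools that contain E2.
pool_signals = {
    ("row", 0): {200, 455},
    ("col", 0): {200},
    ("col", 1): {200, 455},
    ("row", 1): {200},
}
assigned = deconvolute(pools, pool_signals)
```

With this layout four extracts need four LC-MS runs, the same as unpooled analysis, but sixteen extracts need only eight, and each extract is still observed in two independent pools.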

[000463] A secondary screen may be required to eliminate false positives. 

[000464] This method is more specific for identifying potential novel compounds by molecular ion than 
current methods. This method uses a different data analysis strategy than the de-replication methods for the 
identification of specific peaks for new compounds in extracts. Using the molecular ion as the signal to 
collect on, this method may be coupled to mass-based collection methods for the rapid isolation of 
compounds. 

[000465] Related references: "Rapid Method to Estimate the Presence of Secondary Metabolites in 
Microbial", Higgs, R. E.; Zahn, et al., Appl. Environ. Microbiol. 67: 371-376. 

"Use of direct-infusion electrospray mass spectrometry to guide empirical development of improved 
conditions for expression of secondary metabolites from Actinomycetes", Zahn, et al., Appl. Environ. 
Microbiol. 67: 377-386. 

"A general method for the de-replication of flavonoid glycosides utilizing high performance liquid 
chromatography mass spectrometric analysis", Constant, et al., Phytochemical Analysis, 1997, 8: 176-180. 

Method Information 
Gradient column analysis of crude extracts by positive ion mode. 

1100 Quaternary Pump 1 
Control 
  Column Flow : 1.000 ml/min 
  Stoptime : 4.00 min 
  Posttime : Off 
Solvents 
  Solvent A : 98.0 % (Water) 
  Solvent B : 0.0 % (MeOH) 
  Solvent C : 2.0 % (AcCN) 
  Solvent D : 0.0 % (iPrOH) 
PressureLimits 
  Minimum Pressure : 0 bar 
  Maximum Pressure : 400 bar 
Auxiliary 
  Maximal Flow Ramp : 100.00 ml/min^2 
  Primary Channel : Auto 
  Compressibility : 100*10^-6/bar 
  Minimal Stroke : Auto 
Store Parameters 
  Store Ratio A : Yes 
  Store Ratio B : Yes 
  Store Ratio C : Yes 
  Store Ratio D : Yes 
  Store Flow : Yes 
  Store Pressure : Yes 

Agilent 1100 Contacts Option 
  Contact 1 : Open 
  Contact 2 : Open 
  Contact 3 : Open 
  Contact 4 : Open 
Timetable 
  Time   Solv.B   Solv.C   Solv.D   Flow    Pressure 
  0.00   0.0      2.0      0.0      1.000 
  0.01   0.0      2.0      0.0 
  0.30   0.0      95.0     0.0 
  1.50   0.0      95.0     0.0 
  1.60   0.0      2.0      0.0 
  4.00   0.0      2.0      0.0 

Agilent 1100 Contacts Option Timetable 
  Timetable is empty 

Agilent 1100 Diode Array Detector 1 
Signals 
  Signal   Store   Signal, Bw [nm]   Reference, Bw [nm] 
  A:       Yes     215, 4            450, 100 
  B:       No      254, 4            450, 100 
  C:       No      280, 4            450, 100 
  D:       No      250, 16           Off 
  E:       No      280, 16           Off 
Spectrum 
  Store Spectra : Apex + Baselines 
  Range from : 190 nm 
  Range to : 600 nm 
  Range step : 2.00 nm 
  Threshold : 1.00 mAU 
Time 
  Stoptime : As pump 
  Posttime : Off 
Required Lamps 
  UV lamp required : Yes 
  Vis lamp required : Yes 
Autobalance 
  Prerun balancing : Yes 
  Postrun balancing : No 
Margin for negative Absorbance : 100 mAU 
Peakwidth : > 0.1 min 
Slit : 4 nm 
Analog Outputs 
  Zero offset ana. out. 1 : 5 % 
  Zero offset ana. out. 2 : 5 % 
  Attenuation ana. out. 1 : 1000 mAU 
  Attenuation ana. out. 2 : 1000 mAU 

Mass Spectrometer Detector 
General Information 
  Use MSD : Enabled 
  Ionization Mode : APCI 
  Tune File : atunes.tun 
  StopTime : asPump 
  Time Filter : Enabled 
  Data Storage : Condensed 
  Peakwidth : 0.15 min 
  Scan Speed Override : Disabled 
Signals 
[Signal 1] 
  Polarity : Positive 
  Fragmentor Ramp : Disabled 
  Scan Parameters 
  Time (min) | Mass Range Low | Mass Range High | Fragmentor | Gain EMV | Threshold | Stepsize 
  0.00       | 110.00         | 1500.00         | 70         | 1.0      | 500       | 0.15 
[Signal 2] 
  Polarity : Positive 
  Fragmentor Ramp : Disabled 
  Scan Parameters 
  Time (min) | Mass Range Low | Mass Range High | Fragmentor | Gain EMV | Threshold | Stepsize 
  0.00       | 110.00         | 1500.00         | 110        | 1.0      | 500       | 0.15 
[Signal 3] Not Active 
[Signal 4] Not Active 
Spray Chamber [MSZones] 
  Gas Temp : 350 C maximum 350 C 
  Vaporizer : 375 C maximum 500 C 
  DryingGas : 3.0 l/min maximum 13.0 l/min 
  Neb Pres : 60 psig maximum 60 psig 
  VCap (Positive) : 3000 V 
  VCap (Negative) : 3000 V 
  Corona (Positive) : 4.0 uA 
  Corona (Negative) : 15 uA 
FIA Series 
  FIA Series in this Method : Disabled 
Time Setting 
  Time between Injections : 1.00 min 

Agilent 1100 Column Thermostat 1 
Temperature settings 
  Left temperature : 35.0°C 
  Right temperature : Same as left 
  Enable analysis : When Temp. is within setpoint +/- 0.8°C 
  Store left temperature : Yes 
  Store right temperature : No 
Time 
  Stoptime : As pump 
  Posttime : Off 
Column Switching Valve : Column 2 
Timetable 
is empty 

Mass Spec Screening Flow chart 

1. Optional - Pool samples: Use attached pooling strategy. 
2. Measure Data: Use LC-MS to acquire data. 
3. Extract Data: Extract mass spectra into .csv file format. 
4. Identify consistent signals in sample: Compare same sample runs to each other, using the Summary.m 
   program; bin frequently/universally occurring signals; deconvolute pools if sample pooling in step 1 
   was used. 
   During the process create a background file by looking for a certain percentage signal occurrence 
   per mass unit. Use the Summary.m program to create this background spectra for use later in step 5 
   below. 
5. Determine Unique Peaks in Sample vs. Background: 
   1. Convert percent occurrence per mass into a new sample spectra file. 
   2. Use Massieve to determine unique peaks in all voltages and chromatographic fractions compared to 
      background. 
   3. Create 'Unique Peaks' file for each voltage, chromatographic peak comparison. 
6. Eliminate extra peaks by taking advantage of multiple MS detection channels and chromatographic 
   conditions: Feed 'Unique Peaks' file for each sample back into the Summary.m program; keep peaks that 
   show up in more than one mass spectrometer channel or chromatographic peak. 
7. Short list of novel compound signals. 

clear dir 
CompressCount = 1; 
TestFileData = [12 34 45 56 67] 
MasterDir = 'C:\HPCHEM\1\DATA\MS20FEBA\IND4TSr'; % User inputed directory containing other directories with files 
cd(MasterDir); 
MasterDirFiles = dir % Load all files in master directory to one variable. 

TotalFiles = size(MasterDirFiles) 
Original_Files = 'Original Files'; 
X = 990099 
% Loop to create compressed directory listing containing only directories. 
for ExtractDir = 1 : TotalFiles(1, 1) % Look through find directories in master directory 
    if MasterDirFiles(ExtractDir).isdir == 1 % Test each dir item to see if it is a directory 
        Is_Original_Files = strcmp(MasterDirFiles(ExtractDir).name, Original_Files); 
        if not(Is_Original_Files) 
            CompressedDirList(CompressCount).name = MasterDirFiles(ExtractDir).name; % assign new directories. 
            CompressCount = CompressCount + 1; % Increment count compressed directories 
        end 
    end 
end 
CompressCount 
TotalDirectories = size(CompressedDirList); 
CompressCount = 1; 
for CompressCount = 3 : TotalDirectories(1, 2) % Main loop for moving in and out of directories. 

    CurrentDirectory = CompressedDirList(CompressCount).name; 
    cd(CurrentDirectory); 
    FileNameStub = char(pwd) 
    % Loop to replace backslash in directory names to dash so directory names can be labels 
    i = 0; 
    FileNameLength = size(FileNameStub) 
    for i = 1 : FileNameLength(1, 2) 
        if FileNameStub(1, i) == '\' 
            FileNameStub(1, i) = '-'; 
        end 
    end 
    ListOfCsvFiles = dir('*.csv') 
    PrintHistograms = 0; % 1 means print histogram, 0 means no print. 
    % Whether they are printed or not the files will be saved. 
    spectra = []; % Clear spectra 
    mass = 109.8 % Initial starting mass. 
    CutoffPercent = 440; % Cutoff percent to check if peak is consistently present 
    spectra = dlmread(ListOfCsvFiles(1).name); % Loads first item in dir call into spectra 
    sizespectra = size(spectra); % Determines size of first spectra loaded. 
    master = []; 
    d = 1; 
    SignalOne = []; 
    SignalTwo = []; 
    endspectra = 0; 
    format compact % Output form for any variables displayed during run. 

    BiggestSpectra = 0; % Initialize the biggest spectra in batch 
    BiggestObsMass = 0; % Initialize the biggest observed mass in any spectra 
    FileNameRoot = ('-Names.csv'); 
    % Routine to sort filenames into alphabetical order - should correspond to chronological order for 
    % individual mass spectra. 
    SizeDirList = size(ListOfCsvFiles); 
    for FileNameOrder = 1 : SizeDirList(1, 1) 
        DataFileName(FileNameOrder, :) = ListOfCsvFiles(FileNameOrder).name 
    end 
    SortedDataFileName = sortrows(DataFileName) 
    % Routine to prepare NameFile.csv file for writing 
    FileNames = strcat(FileNameStub, FileNameRoot); % Create full filename as a variable. 
    NameFile = fopen(FileNames, 'a+') % Open file to record filenames used to create master matrix 
    NameOut = char('Mass'); 
    fprintf(NameFile, NameOut); 
    fprintf(NameFile, '\n'); % Prints headerline of name file 
    % loop to determine largest measured mass and to write filenames in output files 
    % to allow matching filenames and columns from directory lists imported into summary1 
    for testlength = 1 : SizeDirList(1, 1) 
        spectra = dlmread(SortedDataFileName(testlength, :)); 
        sizespectra = size(spectra); 
        if sizespectra(1, 1) > BiggestSpectra 
            BiggestSpectra = sizespectra(1, 1); 
        end 
        if spectra(sizespectra(1, 1), 1) > BiggestObsMass 
            BiggestObsMass = spectra(sizespectra(1, 1), 1); 
        end 
        OddCol = ((testlength * 2) + 1); 
        EvenCol = testlength * 2; 
        Name(OddCol) = cellstr('X'); 
        Name(EvenCol) = cellstr(SortedDataFileName(testlength, :)); 
        NameOut = char(Name(EvenCol)) 
        Spacer = char(Name(OddCol)) 
        fprintf(NameFile, NameOut); 
        fprintf(NameFile, '\n'); % Writes even rows filenames, with linebreak between. 
        fprintf(NameFile, Spacer); 
        fprintf(NameFile, '\n'); % Writes odd row with the spacer, with a linebreak between. 
    end 
    fclose(NameFile); % Close the file with the file names. 

    Name(1) = cellstr('Mass'); 
    for i = 1 : (BiggestObsMass - 100) % loop to fill master matrix from 100 to high mass value 
        master(i, 1) = mass; % fills in the first column of master with mass units 
        mass = mass + 1; 
    end 
    for d = 1 : SizeDirList(1, 1) % loop to bin spectral intensities into master matrix 
        spectra = dlmread(SortedDataFileName(d, :)); % reads current file in to variable spectra 
        mass = 109.8; % Re initialize starting point 
        sizemaster = size(master); 
        mcol = d * 2; 
        sizespectra = size(spectra); 
        % Print current index and current filename being operated on 
        d 
        FileNameStub 
        SortedDataFileName(d) 
        PreviousMass = 0; 
        PreviousIntensity = 0; 
        MaxColmIntensity(1, mcol) = 0; % Sets column intensity to zero so a comparison can be made. 
        MaxColmIntensity(1, mcol + 1) = 0; % Sets column intensity to zero so a comparison can be made. 
        for i = 1 : sizemaster(1, 1) % loop that goes through every row of master, adding columns as spectral data is read 
            j = 1; 
            endspectra = 0; 
            while spectra(j, 1) < (mass + 1) & endspectra == 0 % loop that checks if there is a data point at a mass 
                intensity = spectra(j, 2); % Mass signal intensity is in column 2 of Masstab files 
                smass = spectra(j, 1); % m/z value for each mass is in column 1 of Masstab files. 
                % InBin = Logical variable to determine if the current mass is in a bin 
                InBin = ((smass >= mass) & (smass < (mass + 1)) & (intensity > 0)); 
                % InSameBin = Logical variable to determine if there is a second signal at the same mass as the previous one 
                InSameBin = (PreviousMass >= mass & PreviousMass < (mass + 1)) & (PreviousIntensity > 0); 
                if InBin & ~InSameBin % see the mass for the first time - generates SignalOne 
                    master(i, mcol) = spectra(j, 2); 
                    if intensity > MaxColmIntensity(1, mcol) % determine largest value per column 
                        MaxColmIntensity(1, mcol) = intensity; % and store it in MaxColmIntensity for later use. 
                    end 
                end 
                if InSameBin & InBin % see the mass for the second time. 
                    master(i, (mcol + 1)) = spectra(j, 2); % assign mass to master matrix in second signal column 
                    if intensity > MaxColmIntensity(1, mcol + 1) % determine largest value per second signal column 
                        MaxColmIntensity(1, mcol + 1) = intensity; % and store it in MaxColmIntensity for later use. 
                    end 
                end 
                j = j + 1; % this may not be working as I had hoped - should be comparing mass units. 
                if j > sizespectra(1, 1) % Do not look for more masses once the position in master has been reached 
                    endspectra = 1; 
                    j = j - 2; 
                    if j == 0 % prevents j from being set to zero and putting spectra out of range 
                        j = 1; 
                    end 
                end 
                PreviousMass = smass; 
                PreviousIntensity = intensity; 
            end 
            mass = mass + 1; 
        end 
    end 
    mass 
    OutputRoot = char('-output.csv'); 
    Output_File = strcat(FileNameStub, OutputRoot); 
    dlmwrite(Output_File, master); % Write master matrix to file. 
    sizemaster = size(master); 
    SignalOne(1, 1) = 0; 
    SignalTwo(1, 1) = 0; 
    Even = 'Even'; 
    Odd = 'Odd'; 
    SignalOneNormalizedExists = 0; 
    SignalTwoNormalizedExists = 0; 
    % Loop to sort out the two signals into the SignalOne and SignalTwo matrices. 

    % Will also create the relative intensity matrices SignalOnePercent and SignalTwoPercent 
    % so that the signals can be analyzed on a relative intensity basis. 
    for d = 1 : sizemaster(1, 2) % Go through full length of the master matrix. 
        d; 
        for i = 1 : (BiggestObsMass - 100) % Go through all the masses. 
            i; 
            Halfd = d / 2; 
            master(i, d); 
            % Put in the mass labels down the first column of the separate signal files. 
            SignalOne(i, 1) = master(i, 1); 
            SignalTwo(i, 1) = master(i, 1); 
            SignalOnePercent(i, 1) = master(i, 1); 
            SignalTwoPercent(i, 1) = master(i, 1); 
            if Halfd == round(Halfd) % Put the even rows in SignalOne 
                Comprsd_even_d = (d / 2) + 1; 
                SignalOne(i, Comprsd_even_d) = master(i, d); 
                if MaxColmIntensity(1, d) ~= 0 % Determine relative intensities of first signal. 
                    SignalOnePercent(i, Comprsd_even_d) = master(i, d) / MaxColmIntensity(1, d) * 100; 
                    SignalOneNormalizedExists = 1; % Flag to prevent SignalOnePercent save if empty 
                end 
            end % Even 
            if Halfd ~= round(Halfd) % Puts the odd rows in SignalTwo 
                Comprsd_odd_d = round(Halfd); 
                % size_signal_2 = size(SignalTwo) 
                if d <= sizemaster(1, 2) % prevents out of range in master because of missing signal 2 column 
                    SignalTwo(i, Comprsd_odd_d) = master(i, d); 
                    if MaxColmIntensity(1, d) ~= 0 % Determine relative intensities of second signal. 
                        SignalTwoPercent(i, Comprsd_odd_d) = master(i, d) / MaxColmIntensity(1, d) * 100; 
                        SignalTwoNormalizedExists = 1; % Flag to prevent SignalTwoPercent save if empty 
                    end 
                end 
            end % Odd 
        end % i 
    end % d 
    Signal1Root = char('-SignalOne-output.csv'); 
    Signal_1_File = strcat(FileNameStub, Signal1Root); 
    dlmwrite(Signal_1_File, SignalOne); % Write first signal data file. 
    Signal2Root = char('-SignalTwo-output.csv'); 
    Signal_2_File = strcat(FileNameStub, Signal2Root); 
    dlmwrite(Signal_2_File, SignalTwo); % Write second signal data file. 
    if SignalOneNormalizedExists 
        Normal1Root = char('-Normal-SignalOne-output.csv'); 
        Normal_1_File = strcat(FileNameStub, Normal1Root); 
        dlmwrite(Normal_1_File, SignalOnePercent); % Write first signal relative (normalized) data file. 
    end 
    if SignalTwoNormalizedExists 
        Normal2Root = char('-Normal-SignalTwo-output.csv') 
        Normal_2_File = strcat(FileNameStub, Normal2Root); 
        dlmwrite(Normal_2_File, SignalTwoPercent); % Write second signal relative (normalized) data file. 
    end 
    % Procedure to create percentage occurrence summaries and to send out histograms of backgrounds. 
    size_signal_1 = size(SignalOne); 
    size_signal_2 = size(SignalTwo); 
    ZeroPercent = 0; 
    TwoFivePercent = 2.5; 
    FivePercent = 5; 
    for row = 1 : size_signal_1(1, 1) % Main loop to create counts at certain frequencies. 
        row 
        FileNameStub 
        GreaterThanZero = 0; % Initialize each counter per row. 

        GreaterThanTwoFive = 0; 
        GreaterThanFive = 0; 
        for colm = 2 : size_signal_1(1, 2) 
            % colm 
            % Count number of times a signal intensity occurs per mass unit. 
            if SignalOnePercent(row, colm) > ZeroPercent 
                GreaterThanZero = GreaterThanZero + 1; 
            end 
            if SignalOnePercent(row, colm) > TwoFivePercent 
                GreaterThanTwoFive = GreaterThanTwoFive + 1; 
            end 
            if SignalOnePercent(row, colm) > FivePercent 
                GreaterThanFive = GreaterThanFive + 1; 
            end 
        end % end column for loop 
        % Determine percent times there is a signal per mass 
        % First column of Summary = mass index, 
        % Columns 2-4 of Summary = percent occurrence of intensity. 
        % Columns 5-7 of Summary = Greater than PercentCutoff Occurrence of signals per run. 
        if SignalOneNormalizedExists 
            Summary1(row, 1) = master(row, 1); 
            Summary1(row, 2) = GreaterThanZero / (size_signal_1(1, 2) - 1) * 100; 
            Summary1(row, 3) = GreaterThanTwoFive / (size_signal_1(1, 2) - 1) * 100; 
            Summary1(row, 4) = GreaterThanFive / (size_signal_1(1, 2) - 1) * 100; 
            TwoColSummary(row, 1) = master(row, 1); 
            if Summary1(row, 2) >= CutoffPercent 
                Summary1(row, 5) = 1; 
                TwoColSummary(row, 2) = 1; 
            else 
                Summary1(row, 5) = 0; 
                TwoColSummary(row, 2) = 0.01; 
            end 
            if Summary1(row, 3) >= CutoffPercent 
                Summary1(row, 6) = 1; 
            else 
                Summary1(row, 6) = 0; 
            end 
            if Summary1(row, 4) >= CutoffPercent 
                Summary1(row, 7) = 1; 
            else 
                Summary1(row, 7) = 0; 
            end 
        end % of if statement 
    end % end row for loop. 

    % Routine to write 6 col and 2 col summary file of peak occurrence. 
    if SignalOneNormalizedExists 
        SummaryRoot = char('-SignalOne-Summary.csv'); 
        SummaryFile = strcat(FileNameStub, SummaryRoot); 
        dlmwrite(SummaryFile, Summary1); 
        TwoColSummaryRoot = char('-SignalOne-TwoColSummary.csv'); 
        TwoColSummaryFile = strcat(FileNameStub, TwoColSummaryRoot); 
        % Use fprintf file save method to enter zeros into csv files. 
        TwoColSummaryFileOpen = fopen(TwoColSummaryFile, 'a+') 
        TwoColLength = size(TwoColSummary); 
        i = 0; 
        for i = 1 : TwoColLength(1, 1) 
            fprintf(TwoColSummaryFileOpen, '%f %c %f\n', TwoColSummary(i, 1), ',', TwoColSummary(i, 2)); 
        end 
        % fprintf(TwoColSummaryFileOpen, '\n') 
        fclose(TwoColSummaryFileOpen); 
        % dlmwrite(TwoColSummaryFile, TwoColSummary); 
    end 
    % Create histograms showing binning of percentage occurrence, in 5 percent divisions. 
    if SignalOneNormalizedExists 
        figure(1); 
        hist(Summary1(:, 2), 20); 
        OverZero = 'Occurence over 0% - '; 
        FigureTitle = char('- 0% histogram'); 
        TitleWord(1, :) = cellstr(OverZero); 
        TitleWord(2, :) = cellstr(FileNameStub); 
        xlabel('Percent Occurrence'); ylabel('Counts'); title(TitleWord); 
        if PrintHistograms == 1 
            print 
        end 
        FileName = strcat(FileNameStub, FigureTitle); 
        print('-djpeg', '-r200', FileName) 
        figure(2); 
        hist(Summary1(:, 3), 20); 
        OverTwoFive = 'Occurence over 2.5% intensity'; 
        FigureTitle = char('- 2.5% histogram'); 
        TitleWord(1, :) = cellstr(OverTwoFive) 
        TitleWord(2, :) = cellstr(FileNameStub); 
        xlabel('Percent Occurrence'); ylabel('Counts'); title(TitleWord); 
        if PrintHistograms == 1 
            print 
        end 
        FileName = strcat(FileNameStub, FigureTitle); 
        print('-djpeg', '-r200', FileName) 
        figure(3); 
        hist(Summary1(:, 4), 20); 
        OverFive = 'Occurence over 5% intensity'; 
        FigureTitle = char('- 5% histogram'); 
        TitleWord(1, :) = cellstr(OverFive) 
        TitleWord(2, :) = cellstr(FileNameStub); 
        xlabel('Percent Occurrence'); ylabel('Counts'); title(TitleWord); 
        if PrintHistograms == 1 
            print 
        end 
        FileName = strcat(FileNameStub, FigureTitle); 
        print('-djpeg', '-r200', FileName) 
        % Create bar graphs showing positions observed more than 50% of the time vs mass. 
        figure(4); 
        bar(Summary1(:, 1), Summary1(:, 5)); 
        OverZero2 = 'Greater than 50% occurrence of signal over 0% - '; 
        FigureTitle = char('- 50% - 0% intensity'); 
        TitleWord(1, :) = cellstr(OverZero2) 
        TitleWord(2, :) = cellstr(FileNameStub); 
        xlabel('Mass'); ylabel('Percent Occurrence'); title(TitleWord); 
        if PrintHistograms == 1 
            print 
        end 
        FileName = strcat(FileNameStub, FigureTitle); 
        print('-djpeg', '-r200', FileName) 
        figure(5); 
        bar(Summary1(:, 1), Summary1(:, 6)); 
        OverTwoFive2 = 'Greater than 50% occurrence of signal over 2.5% - '; 
        FigureTitle = char('-50%-2.5% intensity'); 
        TitleWord(1, :) = cellstr(OverTwoFive2) 
        TitleWord(2, :) = cellstr(FileNameStub); 
        xlabel('Mass'); ylabel('Percent Occurrence'); title(TitleWord); 
        if PrintHistograms == 1 
            print 
        end 
        FileName = strcat(FileNameStub, FigureTitle); 
        print('-djpeg', '-r200', FileName) 
        figure(6); 
        bar(Summary1(:, 1), Summary1(:, 7)); 
        OverFive2 = 'Greater than 50% occurrence of signal over 5% - '; 
        FigureTitle = char('- 50% - 5% intensity'); 
        TitleWord(1, :) = cellstr(OverFive2) 
        TitleWord(2, :) = cellstr(FileNameStub); 
        xlabel('Mass'); ylabel('Percent Occurrence'); title(TitleWord); 
        if PrintHistograms == 1 
            print 
        end 
        FileName = strcat(FileNameStub, FigureTitle); 
        print('-djpeg', '-r200', FileName) 
        % Create percent occurrence vs mass bar graph across all masses. 
        figure(7); 
        bar(Summary1(:, 1), Summary1(:, 2)); 
        OverZero3 = 'Percentage occurrence of signal over 0% - '; 
        FigureTitle = char('- occur per mass at 0 percent'); 
        TitleWord(1, :) = cellstr(OverZero3) 
        TitleWord(2, :) = cellstr(FileNameStub); 
        xlabel('Mass'); ylabel('Percent Occurrence'); title(TitleWord); 
        if PrintHistograms == 1 
            print 
        end 
        FileName = strcat(FileNameStub, FigureTitle); 
        print('-djpeg', '-r200', FileName) 
        figure(8); 
        bar(Summary1(:, 1), Summary1(:, 3)); 
        OverTwoFive3 = 'Percentage occurrence of signal over 2.5% - '; 
        FigureTitle = char('- occur per mass at 2.5 percent'); 
        TitleWord(1, :) = cellstr(OverTwoFive3) 
        TitleWord(2, :) = cellstr(FileNameStub); 
        xlabel('Mass'); ylabel('Percent Occurrence'); title(TitleWord); 
        if PrintHistograms == 1 
            print 
        end 
        FileName = strcat(FileNameStub, FigureTitle); 
        print('-djpeg', '-r200', FileName) 
        figure(9); 
        bar(Summary1(:, 1), Summary1(:, 4)); 
        OverFive3 = 'Percentage occurrence of signal over 5% - '; 
        FigureTitle = char('- occur per mass at 5 percent'); 
        TitleWord(1, :) = cellstr(OverFive3) 
        TitleWord(2, :) = cellstr(FileNameStub); 
        xlabel('Mass'); ylabel('Percent Occurrence'); title(TitleWord); 
        if PrintHistograms == 1 
            print 
        end 
        FileName = strcat(FileNameStub, FigureTitle); 
        print('-djpeg', '-r200', FileName) 
    end % of if SignalOneNormalizedExists statement. 

    % Return to matlab directory 
    % cd C:\matlabr11\work 
    % tods 
    % pwd 
    dlmwrite('FILE.txt', TestFileData) 
    cd ..; 
    X % prints after while 
end % Main loop for moving in and out of directories. 

% Aline1.m
%
% The program determines the average background value looking at the entire peak shape of the spectra.
% Will need another program to take the measured spectra of true samples and compare them to the
% average values of the average spectra determined here and then see if they fall within a certain
% percentage of the RMSD values to see if they are correct.
clear
dir
CompressCount = 1;
TestFileData = [12 34 45 56 67] % Test data for file written as test of program - remove later
MasterDir = 'C:\MATLABR11\work\TestData'; % User inputted directory containing other directories with files
cd(MasterDir);
MasterDirFiles = dir % Load all files in master directory to one variable.
TotalFiles = size(MasterDirFiles)
Original_Files = 'Original Files';
X = 99099 % Value used to show completion of loop.
% Loop to create compressed directory listing containing only directories.
for ExtractDir = 1:TotalFiles(1,1) % Look through find directories in master directory
if MasterDirFiles(ExtractDir).isdir==1 % Test each dir item to see if it is a directory
Is_Original_Files = strcmp(MasterDirFiles(ExtractDir).name, Original_Files);
if not(Is_Original_Files)
CompressedDirList(CompressCount).name = MasterDirFiles(ExtractDir).name; % assign new directories.
CompressCount = CompressCount+1; % Increment count compressed directories
end
end
end
TotalDirectories = size(CompressedDirList);
CompressCount = 1;
for CompressCount = 3:TotalDirectories(1,2) % Main loop for moving in and out of directories.

CurrentDirectory = CompressedDirList(CompressCount).name;
cd(CurrentDirectory);
FileNameStub = char(pwd)
% Loop to replace backslash in directory names to dash so directory names can be labels
i = 0;
FileNameLength = size(FileNameStub)
for i = 1:FileNameLength(1,2)
if FileNameStub(1,i)=='\'
FileNameStub(1,i) = '-';
end
end
ListOfCsvFiles = dir('*.csv')
Spectra = []; % Clear Spectra
mass = 109.8 % Initial starting mass.
Spectra = dlmread(ListOfCsvFiles(1).name); % Loads first item in dir call into Spectra
sizespectra = size(Spectra); % Determines size of first Spectra loaded.
% master = [];
d = 1; SignalOne = []; SignalTwo = []; % Clear master, SignalOne, SignalTwo
endspectra = 0;
format compact % Output form for any variables displayed during run.
BiggestSpectra = 0; % Initialize the biggest spectra in batch
BiggestObsMass = 0; % Initialize the biggest observed mass in any spectra
FileNameRoot = ('-Names.csv');
% Routine to sort filenames into alphabetical order - should correspond to chronological order for
% individual mass spectra.
SizeDirList = size(ListOfCsvFiles);
for FileNameOrder = 1:SizeDirList(1,1)
DataFileName(FileNameOrder,:) = ListOfCsvFiles(FileNameOrder).name
end
SortedDataFileName = sortrows(DataFileName)
% Routine to prepare NameFile.csv file for writing
FileNames = strcat(FileNameStub, FileNameRoot); % Create full filename as a variable.
NameFile = fopen(FileNames,'a+') % Open file to record filenames used to create master matrix
NameOut = char('Mass');
fprintf(NameFile, NameOut); fprintf(NameFile,'\n'); % Prints headerline of name file
% Loop to determine largest measured mass and to write filenames in output files
% to allow matching filenames and columns from directory lists imported into Aline
for testlength = 1:SizeDirList(1,1)
Spectra = dlmread(SortedDataFileName(testlength,:));
sizespectra = size(Spectra);
if sizespectra(1,1) > BiggestSpectra
BiggestSpectra = sizespectra(1,1);
end
if Spectra(sizespectra(1,1),1) > BiggestObsMass
BiggestObsMass = Spectra(sizespectra(1,1),1);
end
OddCol = ((testlength*2)+1);
EvenCol = testlength*2;
Name(OddCol) = cellstr('X');
Name(EvenCol) = cellstr(SortedDataFileName(testlength,:));
NameOut = char(Name(EvenCol))
Spacer = char(Name(OddCol))
fprintf(NameFile, NameOut); fprintf(NameFile,'\n'); % Writes even rows filenames, with linebreak between.
fprintf(NameFile, Spacer); fprintf(NameFile,'\n'); % Writes odd row with the spacer, with a linebreak between.
end
fclose(NameFile); % Close the file with the file names.

Name(1) = cellstr('Mass');
% Loop to fill first column of matrices from 100 to high mass value with the mass labels.
for i = 1:(BiggestObsMass-100)
MaxPositionMaster(i,1) = mass;
AverageMaxPos(i,1) = mass;
TruncAverageMaxPos(i,1) = mass;
MaxPosDifference(i,1) = mass;
MasterMeanShiftedSpectra(i,1) = mass;
MasterStDevShiftedSpectra(i,1) = mass;
mass = mass+1;
end
%%%%%%%%%%%%%%%%%%%%%%
% MAIN LOOP TO ORGANIZE ROWS OF MASSES FROM DIFFERENT FILES.
%%%%%%%%%%%%%%%%%%%
% Main loop to:
% 1) Read data row by row into master matrix
% 2) Determine first maxima of each peak
% 3) Determine average max position for each mass
% 4) Determine amount to shift each spectra
% 5) Shift each spectra the appropriate amount to align the maxima
% 6) Determine the mean spectra by averaging intensity at each point.
% 7) Determine the standard deviation between the measured spectra and the average.
% 8) Record the row by row averages and RMSD's into a master matrix for saving to files at the end.
for MassPosition = 1:(BiggestObsMass-100)
% Loop to open each file and read values into MasterMassRowMatrix
% Item 1 above
for FileNumber = 1:SizeDirList(1,1)
Spectra = []; % Clear spectra for new values from next file.
Spectra = dlmread(SortedDataFileName(FileNumber,:)); % Read spectra sequentially for MasterMassPerRow
% Need a line here to test that we are not past the end of the file - test at start with constant width files.
SizeCurrentSpectra = size(Spectra);
if MassPosition <= SizeCurrentSpectra(1,1)
MasterMassPerRow(FileNumber,:) = Spectra(MassPosition, 2:SizeCurrentSpectra(1,2)); % transfer row to master matrix
else
MasterMassPerRow(FileNumber,:) = 0;
end % FileNumber else
end
%%%%%%%%%%%%%%%%%%%%%%
% May have to insert a routine to generate a zerofilled rectangular matrix for later manipulations.

%%%%%%%%%%%%%%%%%%
SizeMasterMassPerRow = size(MasterMassPerRow);
% Find position of first maxima in the current files.
% Item 2 of above
for CurrentFile = 1:SizeMasterMassPerRow(1,1) % go through rows one by one.
NoPeak = 1; % Set marker for no maxima
PosMarker = 2 % Start Current colm position after the mass labels.
% Item 1 from top of loop
while NoPeak % loop continues until the first max is found in each row
YesPeak = 0 % Set YesPeak to negative at start of scan.
CurrentPosValue = MasterMassPerRow(CurrentFile, PosMarker); % set the current position as the center value
if PosMarker > 2
PreviousPosValue = MasterMassPerRow(CurrentFile, PosMarker-1); % Get previous position value during scan.
else
PreviousPosValue = 0; % if at beginning of row let every signal start with a zero value
end % end if PosMarker > 2
if PosMarker == SizeMasterMassPerRow(1,2)
NextPosValue = MasterMassPerRow(CurrentFile, PosMarker) % if at end of row set next value to current value
NoPeak = 0; % Jump out if at the end of the row.
else
NextPosValue = MasterMassPerRow(CurrentFile, PosMarker+1);
end % End of if PosMarker at end
% Determine if these three points describe a peak.
% YesPeak = logical variable to see if CurrentPos is top of peak.
YesPeak = (PreviousPosValue < CurrentPosValue) & (CurrentPosValue > NextPosValue);
if YesPeak
% Record position of maximum in Master MaxPos Matrix
% Rows are masses; columns are FileNumber positions
% Offset CurrentFile by 1 b/c first col'm is the mass label.
MaxPositionMaster(MassPosition, CurrentFile+1) = PosMarker;
NoPeak = 0; % Set NoPeak so while loop can end and check next row.
end % of if YesPeak
PosMarker = PosMarker+1; % Increment PosMarker to next position.
if PosMarker > SizeMasterMassPerRow(1,2)
NoPeak = 0;
end % if PosMarker
end % While NoPeak.
end % CurrentFile for loop
% Item 3 - Determine the average position of maxima for each mass
SumMaxPos = 0;
for AveIndex = 2:(SizeMasterMassPerRow(1,1)+1)
SumMaxPos = SumMaxPos + MaxPositionMaster(MassPosition, AveIndex);
end % for AveIndex
TruncAverageMaxPos(MassPosition, 2) = fix(SumMaxPos/SizeMasterMassPerRow(1,1));
% Item 4 from top of the MassPosition loop
% If a peak is forward (smaller pos #) of the average maxima then the shift is positive,
% if the peak is behind the average maxima then the shift is negative.
for AveIndex = 2:(SizeMasterMassPerRow(1,1)+1)
MaxPosDifference(MassPosition, AveIndex) = MaxPositionMaster(MassPosition, AveIndex) - TruncAverageMaxPos(MassPosition, 2);
end % for AveIndex 2nd time.

% Determine the largest positive and negative shift that needs to be made
% Continuation of item 4.
SizeMaxPositionMaster = size(MaxPositionMaster);
LargestPositiveShift = 0;
LargestNegativeShift = 0;
for i = 2:SizeMaxPositionMaster(1,2)
if MaxPosDifference(MassPosition, i) > LargestPositiveShift
LargestPositiveShift = MaxPosDifference(MassPosition, i)
end
if MaxPosDifference(MassPosition, i) < LargestNegativeShift
LargestNegativeShift = MaxPosDifference(MassPosition, i)
end
end % for i loop.
% Item 5 - Shift the spectra depending on the position of their maxima.
% Fill the ShiftedSpectra matrix with the appropriately shifted spectra from MasterMassPerRow.
ShiftedMatrixWidth = LargestPositiveShift + abs(LargestNegativeShift) + SizeMasterMassPerRow(1,2);
ShiftedSpectra = zeros(SizeMasterMassPerRow(1,1), ShiftedMatrixWidth); % zero fill new shifted spectra matrix
SizeMaxPosDifference = size(MaxPosDifference);
for Shift = 2:SizeMaxPosDifference(1,2);
StartIndex = 1 + LargestPositiveShift - MaxPosDifference(MassPosition, Shift);
FinalPosition = StartIndex + SizeMasterMassPerRow(1,2) - 1;
FileNumber = Shift-1;
MasterMassIndex = 1;
for Index = StartIndex:FinalPosition
ShiftedSpectra(FileNumber, Index) = MasterMassPerRow(FileNumber, MasterMassIndex);
MasterMassIndex = MasterMassIndex+1;
end % Index loop
end % Shift loop
% Item 6 - Create average intensity spectra for each row.
SizeShiftedSpectra = size(ShiftedSpectra);
MeanShiftedSpectra = mean(ShiftedSpectra);
% Item 7 - Determine Standard Deviation for each column of aligned spectra
StDevShiftedSpectra = std(ShiftedSpectra);
% Item 8 - Record the average shifted spectra per mass and the standard dev per position.
MasterDim = size(ShiftedSpectra);
MasterColWidth = MasterDim(1,2)+1;
MasterMeanShiftedSpectra(MassPosition, 2:MasterColWidth) = MeanShiftedSpectra(1,:);
MasterStDevShiftedSpectra(MassPosition, 2:MasterColWidth) = StDevShiftedSpectra(:,:);
dlmwrite('MasterMeanShiftedSpectra.csv', MasterMeanShiftedSpectra);
dlmwrite('MasterStDevShiftedSpectra.csv', MasterStDevShiftedSpectra);
end % MassPosition loop
dlmwrite('FILE.txt', TestFileData)
cd ..

X
end % Compress Count

Example 16: Plasmid DNA transformation protocol for Pseudomonas

a. Preparation of electroporation competent cells

[000466] 1 ml of overnight culture is inoculated into 100 ml LB, and the bacteria are incubated in a 30 degrees
Celsius shaker until the OD600 reading reaches 0.5-0.7. The bacteria are harvested by spinning at SOOOrpm
for 10 minutes at 4 degrees Celsius.

[000467] The resulting cell pellet is washed with 100 ml ice-cold ddH2O and spun at SOOOrpm for 10
minutes at 4°C to collect the cells. The washing is repeated. The cells are then washed once with 50 ml
ice-cold 10% glycerol (in ddH2O) and collected by spinning at SOOOrpm for 10 minutes at 4°C. The
bacterial cells are resuspended in 2 ml ice-cold 10% glycerol (in ddH2O); 50 µl or 100 µl is aliquotted into
each of the tubes and stored at -80°C.

b. Electroporation

[000468] 1 µl plasmid DNA is mixed with 50 µl competent cells and kept on ice for 5 minutes. The mixture
is transferred to a pre-chilled cuvette (0.2 cm gap, Bio-Rad). The DNA is transformed into the bacteria by
electroporation with a Bio-Rad machine (settings: voltage 2.25 kV; time 5 ms; capacitance 25 µF).

[000469] 300 µl SOC medium is added to the cell mixture and the bacteria are incubated in a 30°C shaker
for one hour. A certain amount of culture is spread on LA plates with antibiotics and the plates are
incubated at 30 degrees Celsius.

Example 17: Transformation of Yeast Cells by Electroporation

[000470] One day before the experiment, 10 ml of YPD medium is inoculated with a single yeast colony of
the strain to be transformed. It is grown overnight to saturation at 30 degrees Celsius. On the day of
competent cell preparation, the total volume of yeast overnight culture is transferred to a 2 L baffled flask
containing 500 ml YPD medium. The culture is grown with vigorous shaking at 30 degrees Celsius to an
OD600 of 0.8-1.0.

[000471] 500 ml of culture is harvested by centrifuging at 4000 x g, 4 degrees Celsius, for 5 minutes in
autoclaved bottles. The supernatant is subsequently discarded. The cell pellet is washed in 250 ml cold
sterile water. Washing is repeated twice. The supernatant is discarded.

[000472] The pellet is resuspended in 30 ml of ice-cold 1 M sorbitol. The suspension is transferred into a
sterile 50 ml conical tube. The mixture is centrifuged in a GP-8 centrifuge at 2000 rpm, 4 degrees Celsius,
for 10 min. The supernatant is discarded. The pellet is resuspended in 50 µl of ice-cold 1 M sorbitol. The
final volume of resuspended yeast should be 1.0 to 1.5 ml and the final OD600 should be ~200.

[000473] In a sterile, ice-cold 1.5-ml microcentrifuge tube, 40 µl concentrated yeast cells are mixed with
1 µg of DNA contained in 50 µl. The mixture is transferred to an ice-cold 0.2-cm-gap disposable
electroporation cuvette and pulsed at 1.5 kV, 25 µF, 200 Ω. It should be noted that the time constant
reported by the Gene Pulser will vary from 4.2 to 4.9 msec. Times <4 msec or the presence of a current arc
(evidenced by a spark and smoke) indicate that the conductance of the yeast/DNA mixture is too high.
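
For an exponential-decay pulse the time constant is the product of circuit resistance and capacitance, which is why a short time constant points to a conductive sample. A minimal sketch of that arithmetic, assuming the 200 Ω setting acts as a resistor in parallel with the sample; the function name and the two sample resistances are hypothetical illustrations, not measured values.

```python
def time_constant_ms(capacitance_uF, parallel_ohms, sample_ohms=None):
    """RC time constant in milliseconds.

    Assumption: the instrument resistor and the sample sit in parallel,
    so a more conductive sample lowers the effective resistance.
    """
    resistance = parallel_ohms
    if sample_ohms is not None:
        resistance = (parallel_ohms * sample_ohms) / (parallel_ohms + sample_ohms)
    return resistance * capacitance_uF * 1e-6 * 1e3


# Upper bound set by the instrument alone: 200 ohm x 25 uF.
print(time_constant_ms(25, 200))
# Hypothetical sample resistances: a well-washed and a salty preparation.
print(time_constant_ms(25, 200, sample_ohms=2000))
print(time_constant_ms(25, 200, sample_ohms=600))
```

With these assumed values the instrument-only limit is 5 ms, the well-washed sample lands inside the 4.2 to 4.9 msec window, and the salty sample drops below 4 msec, the condition the protocol flags as too conductive.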

[000474] 400 µl ice-cold 1 M sorbitol is added to the cuvette and the yeast is recovered, with gentle mixing.
200 µl aliquots of the yeast suspension should be spread directly on sorbitol selection plates. Incubate 3 to 6
days at 30 degrees Celsius until colonies appear.

Example 18: An Exemplary Novel High Throughput Cultivation Method

[000476] The invention provides an exemplary high throughput cultivation method based on the combination
of a microenvironment (a single cell encapsulation procedure) with flow cytometry. This method enables
cells to grow with nutrients that are present at environmental concentrations.

[000477] Seawater was collected from sites located in the Sargasso Sea. Individual cells were concentrated
from this seawater by tangential flow filtration and encapsulated in gel microdroplets (GMDs). Similar
GMDs have been used previously to grow bacteria12 and for screening purposes. Single encapsulated cells
(see Methods) were transferred into chromatography columns (referred to henceforth as growth columns).
Different culture media selective for aerobic, nonphototrophic organisms were pumped through the growth
columns containing 10 million GMDs (Figure 24). The pore size of the GMDs allows the free exchange of
nutrients. The encapsulated microorganisms were able to divide (proliferate) and form microcolonies of
approximately 20 to 100 cells within the GMDs. Based on their distinctive light scattering signature, these
microcolonies were detected and separated by flow cytometry at a rate of 5,000 GMDs per second. The
increase in forward and side scatter was shown by microscopy to be directly proportional to the size of the
microcolony grown within the GMD. This property enabled discrimination between unencapsulated single
cells, empty or singly occupied GMDs, and GMDs containing a microcolony (Figure 25).
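
As a rough throughput check on the figures in this paragraph, the sketch below is pure arithmetic. It assumes every GMD in a column is passed through the cytometer at the quoted rate, which the text does not state explicitly, and the function name is hypothetical.

```python
def sort_time_minutes(n_gmds, rate_per_second):
    """Minutes needed to pass n_gmds through a sorter at a constant event rate."""
    return n_gmds / rate_per_second / 60


# 10 million GMDs per growth column at 5,000 GMDs per second.
print(round(sort_time_minutes(10_000_000, 5_000), 1))
```

Under that assumption a full column is screened in roughly half an hour, which is what makes screening at several incubation time points practical.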

[000478] To determine the optimal growth medium for a broad diversity of organisms, four media were 
tested in the growth columns: Organic rich medium diluted in seawater (marine medium); seawater 
amended with a mixture of amino acids; seawater amended with inorganic nutrients; and sterile filtered 
seawater (Figure 24). After five weeks of incubation, 1200 GMDs, each containing a microcolony, were 
collected by flow cytometry from each of the four growth columns. A 16S rRNA gene clone library was 
generated from each group of 1200 microcolonies and analysed. In diluted marine medium, only four 
bacterial species were identified, belonging to the genera Vibrio, Marinobacter or Cytophaga, all common 
sea water bacteria that have been cultivated previously 3 9. The media containing amino acids or inorganic 
minerals revealed slightly more diversity. Analysis of 50 clones derived from each medium yielded twelve 
different bacterial species from the amino acid supplemented medium, and eleven species from the 
inorganic medium. Filtered seawater alone (taken from the original sampling site) yielded the highest 
biodiversity (39 species out of 50 clones analysed), with many different phylogenetic groups represented. 
These results demonstrated that organisms capable of rapid growth outgrew their more fastidious 
neighbours in the presence of organic rich medium. 

[000479] Growth columns were next inoculated with GMDs again generated from samples obtained from
the Sargasso Sea, but now using only filtered seawater as growth medium. From each of two growth
columns, 500 GMDs containing microcolonies were sorted, and the 16S rRNA genes contained therein
were amplified by PCR. A 16S rRNA gene library was also constructed from the original environmental
sample from which the microorganisms were obtained for encapsulation. Most of the environmental 16S
rRNA sequences derived from this latter sample fell within the nine common bacterioplankton groups. In
contrast, many of the 150 16S rRNA gene sequences obtained from the microcolonies fell into clades
which contain no previously cultivated representatives (see supplementary information). Three of the most
notable examples, described in more detail below, were clades affiliated with the Planctomycetes and
relatives, the Cytophaga-Flavobacterium-Bacteroides and relatives, and the alpha subclass of
Proteobacteria (Figure 26). None of these groups were detected within the environmental 16S rRNA gene
clone library (167 clones analysed).

[000480] Five microcolony 16S rRNA gene sequences were related to the Planctomycetales, one of the 
main phylogenetic branches of the domain BacteriaS (Figure 26a). Sequencing of cloned rRNA genes from 
marine environments had previously revealed several new, apparently uncultivated phylotypes within the 
Planctomycetales. Many of these new phylotypes fall within a single, highly diverse monophyletic clade 
that, prior to this study, contained no cultivated representatives. The five Planctomycetales-related 
microcolonies identified in this study form two separate lineages within this deep branching 
Planctomycetales clade (Figure 26a). One lineage, represented by sequences GMD21C08, GMD14H10,
and GMP14H07 (Figure 26a), was most closely related to 16S rRNA gene clone sequences recovered from
bacteria associated with marine corals (84.9-89.2% similar)17. The second lineage, represented by
GMD16E07 and GMD15D02 (Figure 26a), forms a unique line of descent within this clade, and is
<84% similar to all previously published 16S rRNA gene sequences.

[000481] Two microcolony 16S rRNA gene sequences fell within the Cytophaga-Flavobacterium-
Bacteroides and their relatives. These two closely related sequences form a lineage within a cluster of gene
clone sequences from predominantly marine and hypersaline environments19-21. This cluster occupies one
of the deepest phylogenetic branches of the Cytophaga-Flavobacterium-Bacteroides and relatives group;
only the Rhodothermus/Salinibacter lineage is deeper20. Within this cluster, the two microcolony gene
sequences were nearly identical (>99% similar) to environmental 16S rRNA gene clone sequences obtained
from seawater collected off the Atlantic coast of the United States (Figure 26b). Analysis of Phase II
cultures (see later) obtained from these sorted microcolonies (Figure 24) revealed a culture (strain
GMPJE10E6) with an identical 16S rRNA gene sequence that reached an optical density (OD600nm) of
0.3 (Figure 26d).

[000482] A cluster of six microcolonies was recovered that was phylogenetically affiliated with a
previously uncultivated lineage of 16S rRNA gene clone sequences within the alpha subclass of the
Proteobacteria (Figure 26c). The microcolony sequences formed two subclusters; one was closely related
to two 16S rRNA gene clone sequences recovered from marine samples taken from a coral reef
(95.1-98.6% similar) (GenBank U87483 and U87512); the second was moderately related to the same coral
reef-associated environmental gene clones (87.9-95.7% similar).

[000483] Thus, the application of this novel high throughput cultivation method resulted in the growth and 
isolation of several bacteria representing previously uncultured phylotypes (see supplementary 
information). This reflects the ability of GMDs to permit the simultaneous and non-competitive growth of 
both slow and fast growing microorganisms in media with very low substrate concentrations. The physical 
separation of cells (contained in the GMDs within the growth columns), combined with flow cytometry 
isolation of microcolonies at different times of incubation, enabled the cultivation of a broad range of 
bacteria, and prevented over-growth by the fast growing microorganisms (the "microbial weeds")9.

[000484] To test if this novel high throughput cultivation method is applicable to different environments, 
we applied the technology to an alkaline lake sediment (Lake Bogoria, Kenya, data not shown) and to a soil 
sample (Ghana). Microorganisms from the soil sample were separated from the soil matrix, encapsulated 
and incubated in the growth column under aerobic conditions in the dark. Diluted soil extract, obtained 
from the same sample, was used as growth medium. The microcolonies were analysed by 16S rRNA gene 
sequencing. To cater for bacteria with disparate growth rates, microcolonies were separated from the 
growth column by flow cytometry at different time points. 16S rRNA gene sequence analysis revealed that
many phylogenetically different microorganisms could be cultivated within the GMDs in Phase I (Figure
24) (see supplementary information). This approach can be extended to many other physiological and
environmental conditions. For example, it was demonstrated that encapsulated cells of Methanococcus 
thermolithotrophicus can grow and form microcolonies within GMDs when incubated under strictly 
anaerobic conditions. 

[000485] Physiological studies, natural product screening or studies of cell-cell interaction require the 
ability to grow microorganisms to a certain cell mass. Therefore we designed experiments to determine if 
these microcolonies are able to serve as inocula for larger scale microbial cultures (Figure 24, Phase II). 
Encouragingly, earlier microscopic analysis had revealed that encapsulated bacteria could indeed grow out 
of GMDs when provided with a rich supply of nutrients. GMDs were obtained from a soil sample (Ghana), 
as described above. After growth in diluted soil extract medium, microcolonies were sorted into organic 
rich medium (Figure 24, Phase II). A total of 960 GMDs containing microcolonies, each derived from a 
single organism, were sorted into 96 well microtiter plates filled with organic rich medium (1 GMD per 
well). The 960 cultures were analysed for growth by measuring optical densities (OD600nm). After one
week of incubation, 67% of the cultures showed turbidity above OD 0.1, corresponding to at least 10^7 cells
per millilitre. Cell densities were high enough to permit the detection of anti-fungal activity among some of
the cultures (data not shown). To analyse the diversity within these cultures in more detail, 100 randomly
picked cultures were analysed by 16S rRNA gene sequencing, revealing many different species (see
supplementary information). The remaining 33% of the cultures, which did not grow to measurable densities
(fewer than 10^6 cells per millilitre), showed bacterial growth when assessed microscopically. This is
consistent with recent reports indicating that certain bacteria do not grow to cell densities greater than 10^6
cells per millilitre.

[000486] In order to maintain and access microcolonies for physiological studies, we evaluated the 
minimal number of cells required for passaging by re-encapsulation and detection by flow cytometry. Flow 
cytometry analysis of 1000 and 100 individually encapsulated cells resulted in the detection of 360 and 15
microcolonies, respectively. Even when using cultures comprising just 10 bacterial cells, this method 
allowed recovery of, on average, one viable bacterial culture. This experiment demonstrates that it is 
possible to transfer, and therefore maintain, a culture of 100 cells derived directly from a microcolony. 
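
The recovery counts quoted in this paragraph can be expressed as fractions of the cells encapsulated. The sketch below does only that arithmetic; the function name is hypothetical, and the entry for 10 cells uses the "on average, one" figure from the text.

```python
def recovery_fraction(detected, encapsulated):
    """Fraction of individually encapsulated cells that gave a detectable microcolony."""
    return detected / encapsulated


# (cells encapsulated, microcolonies or cultures recovered) as quoted in the text.
for encapsulated, detected in [(1000, 360), (100, 15), (10, 1)]:
    print(encapsulated, f"{recovery_fraction(detected, encapsulated):.0%}")
```

The fraction recovered falls as the starting population shrinks, from about a third at 1000 cells to about a tenth at 10 cells, yet it stays above zero, which is the basis for the claim that small populations can still be passaged.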

[000487] GMDs separate microorganisms from each other, while still allowing the free flow of signalling 
molecules between different microcolonies. Therefore, this method might be applicable for the analysis of 
interactions between different organisms under in situ conditions, for example by inserting the encapsulated 
cells back into the environment (e.g. the open ocean). The simultaneous encapsulation of more than one
cell (prokaryotic as well as eukaryotic) into one GMD might also be used to mimic conditions found in 
nature, allowing analysis of cell-cell interactions. Another advantage of this technology is the very sensitive 
detection of growth. This high throughput cultivation method allows the detection of microcolonies 
containing as few as 20 to 100 cells. Nutrient sparse media, such as sea water, were sufficient to support 
growth, and yet their carbon content was low enough to prevent "microbial weeds" from overgrowing slow
growing microorganisms. We have demonstrated that this technology can be used to culture thus far
uncultivated microorganisms. The microcolonies obtained can then be used as inocula for further
cultivation.

[000488] In combination with rRNA analysis and mixed organism recombinant screening approaches22,23,
this technology will permit a more complete understanding of unexplored microbial communities. It will 
find applications in environmental microbiology, whole cell optimisation, and drug discovery. The 
combination of cultivation with direct DNA amplification from microcolonies will undoubtedly contribute 
to a broader understanding of microbial ecology by linking microbial diversity with metabolic potential. 

Methods

Sample collection

[000489] Water samples were collected in the Sargasso Sea (31°50'N 64°10'W and 64°30'W) at depths of
3 m and 300 m. For each sample, a volume of 130 l was concentrated by tangential flow filtration. Soil
samples were collected from tropical forest (05°56'N 00°03') and chaparral (05°55'N 00°03'W) in Ghana
and combined in equal amounts. Cells were separated from the soil matrix by repeated shearing cycles
followed by density gradient centrifugation24.

Cell encapsulation and growth conditions

[000490] Concentrated cell suspensions were used for encapsulation. Singly occupied gel microdroplets
(GMDs) were generated using a CellSys 100™ microdrop maker (OneCell System) according to the
manufacturer's instructions. Encapsulation of single cells was monitored by microscopy. The GMDs were
dispensed into sterile chromatography columns XK-16 (Pharmacia Biotec) containing 25 ml of media.
Columns were equipped with two sets of filter membranes (0.1 µm at the inlet of the column and 8 µm at
the outlet). The filters prevented free-living cells from contaminating the media reservoir and retained
GMDs in the column while allowing free-living cells to be washed out.

[000491] Media were pumped through the column at a flow rate of 13 ml/h. Media used for incubation of
marine samples were: filter-sterilized Sargasso Sea water (SSW); SSW amended with NaNO3 (4.25 g/l),
K2HPO4 (0.016 g/l), NH4Cl (0.27 g/l), trace metals and vitamins25; SSW amended with amino acids at
concentrations between 6 and 30 nM26; and marine medium (R2A, Difco) diluted in SSW (1:100, vol/vol).
Soil extracts were prepared as previously described27 and added to the media at final concentrations of 25
to 40 ml/l in 0.85% NaCl (vol/vol). GMDs were incubated in the columns for a period of at least 5 weeks.
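
The flow rate, column fill and incubation time given in the Methods determine how much medium a column consumes. A minimal sketch of that calculation, assuming continuous pumping at the stated rate and treating the 25 ml of media as the exchangeable column volume; both are assumptions, and the function name is hypothetical.

```python
def column_exchange(flow_ml_per_h, column_ml, weeks):
    """Medium turnover for a continuously perfused growth column."""
    hours = weeks * 7 * 24
    total_ml = flow_ml_per_h * hours
    return {
        "hours_per_column_volume": column_ml / flow_ml_per_h,
        "total_litres": total_ml / 1000,
        "column_volumes": total_ml / column_ml,
    }


# 13 ml/h through a column holding 25 ml of medium for 5 weeks.
result = column_exchange(flow_ml_per_h=13, column_ml=25, weeks=5)
for key, value in result.items():
    print(key, round(value, 2))
```

Under these assumptions the medium in a column is replaced about every two hours and roughly 11 litres pass through over five weeks, which is how growth can be sustained even though the nutrient concentration at any moment stays at environmental levels.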

Microcolonies that were sorted individually into 96 well microtitre plates were grown with marine medium
(R2A, Difco) in SSW or with soil extracts amended with glucose, peptone, and yeast extract (1 g/l) and
humic acids extract 0.001% (vol/vol).

Flow cytometry

[000492] GMDs containing colonies were separated from free-living cells and empty GMDs by using a
flow cytometer (MoFlo, Cytomation). Precise sorting was confirmed by microscopy. For the
re-encapsulation experiment, a series of 1000, 100 and 10 Escherichia coli cells (expressing a green
fluorescent protein, ZsGreen, Clontech) were individually encapsulated and incubated for three hours to
form microcolonies within the GMDs. GMDs were analysed by flow cytometry and sorted.

Phylogenetic analysis

[000493] Ribosomal RNA genes from environmental samples, microcolonies and cultures were amplified
by PCR using general oligonucleotide primers (27F and 1392R) for the domain Bacteria. To avoid
nonspecific amplification, PCR reactions were irradiated with a UV Stratalinker (Stratagene) at maximum
intensity prior to template addition. After cloning (TOPO-TA, Invitrogen), inserts were screened by their
restriction pattern obtained with AvaI, BamHI, EcoRI, HindIII, KpnI, and XbaI. Nearly full length 16S
rRNA gene sequences were obtained and added to an aligned database of over 12,000 homologous 16S
rRNA primary structures maintained with the ARB software package28. Phylogenetic relationships were
evaluated using evolutionary distance, parsimony, and maximum likelihood methods, and were tested with
a wide range of bacterial phyla as outgroups29. Hypervariable regions were masked from the alignment.
The phylogenetic trees shown in Figure 26 demonstrate the most robust relationships observed, and were
determined using evolutionary distances calculated with the Kimura 2-parameter model for nucleotide
change and neighbour-joining. Bootstrap proportions from 1000 resamplings were determined using both
evolutionary distance and parsimony methods. Short reference sequences were added to the phylogenetic
trees with the parsimony insertion tool of ARB, and are indicated by dotted lines.

References: 1. Pace, N. R. A molecular view of microbial diversity and the biosphere. Science 276, 734-

EXAMPLE 18: FLUORESCENCE in situ HYBRIDIZATION (FISH) IN SOLUTION

1. Materials and buffers

5 M NaCl
1 M Tris/HCl, pH 8.0
0.5 M EDTA, pH 8.0
10% SDS
H2Obidest
Deionized Formamide
10 mg/mL Lysozyme solution

1.1 Hybridization buffer

Formamide       x µl (see table)
5 M NaCl        180 µl
1 M Tris/HCl    20 µl
10% SDS         1 µl
H2Obidest       ad 1 ml

Composition of 1 mL hybridization buffer at different Formamide concentrations

% FA          0    5   10   15   20   25   30   35   40   45   50   60   70
NaCl [µl]   180  180  180  180  180  180  180  180  180  180  180  180  180
Tris [µl]    20   20   20   20   20   20   20   20   20   20   20   20   20
SDS [µl]      1    1    1    1    1    1    1    1    1    1    1    1    1
FA [µl]       0   50  100  150  200  250  300  350  400  450  500  600  700
H2O [µl]    800  750  700  650  600  550  500  450  400  350  300  200  100

1.2 Washing buffer

5 M NaCl        x µl (see table)
1 M Tris/HCl    1 ml
10% SDS         50 µl
[0.5 M EDTA     500 µl]
H2Obidest

Composition of 5 mL washing buffer linked to different Formamide concentrations (in the hybridization
buffer)

% FA          0    5   10   15   20   25   30   35   40   45   50   60   70
NaCl [µl]   900  630  450  318  215  149  102   70   46   30   18    4    0
Tris [µl]   100  100  100  100  100  100  100  100  100  100  100  100  100
SDS [µl]      5    5    5    5    5    5    5    5    5    5    5    5    5
EDTA [µl]     -    -    -    -   50   50   50   50   50   50   50   50   50
H2O [µl]   3995 4265 4445 4577 4630 4696 4743 4775 4799 4815 4827 4841 4845

2. Protocol (hybridization in solution)

2.1 To be prepared

2.1.1 Prepare probe working solution (100 ng/µl in water) from freeze-dried pellets shipped by the
manufacturer (IDT).

2.1.2 Preheat a hybridization oven with a rotor to 39°C and a water bath to 41°C.

[Hybridizations with single cells are usually carried out at 46/48°C. Since microcapsules (MCs) will 
partially lyse at that temperature, hybridization temperatures need to be reduced by 7°C (equivalent to a 
10% change of the Formamide concentration in the hybridization buffer). Adjust Formamide 
concentrations reported in the literature accordingly (e.g. 35% FA at 46/48°C for probe Bet42a is 
equivalent to 45% FA at 39/41°C).]

2.1.3 Prepare 1 mL hybridization buffer in the appropriate Formamide concentration (see table 1) 
according to probe characteristics (add SDS at the very end). Add 100 probe(s) (working solutions) 
and prewarm buffer to 39°C. 

2.1.4 Prepare washing buffer according to the Formamide concentration in the hybridization buffer (see 
table 2) and prewarm to 41°C. 

2.1.5 Cool DI water in an ice bath to 4°C. 

2.2 Increasing Ethanol series
2.2.1 Take aliquot (~1 ml) of MCs (fresh or stored at -20°C/50% Glycerol) and centrifuge (13,000 rpm, 30 sec). Discard supernatant and resuspend pellet in 1 ml 50% EtOH. 
2.2.2 Let stand for 3 min and centrifuge in a table-top centrifuge (13,000 rpm, 30 sec). 
2.2.3 Discard supernatant and resuspend pellet in 1 ml 80% EtOH. 
2.2.4 Let stand for 3 min and centrifuge (13,000 rpm, 30 sec). 
2.2.5 Discard supernatant and resuspend pellet in 1 ml 100% EtOH. 
2.2.6 Let stand for 3 min and centrifuge (13,000 rpm, 30 sec). 

2.2.7 Let stand for 20 min to evaporate residual EtOH.

2.3 Lysozyme treatment
2.3.1 Resuspend pellet containing MCs (from step 2.2.7) in 1 mL of a 10 mg/mL Lysozyme solution and incubate for 30 min at room temperature while rotating.
2.3.2 Centrifuge at 13,000 rpm (30 sec).
2.3.3 Resuspend pellet in 1 mL DI water.
2.3.4 Centrifuge at 13,000 rpm (30 sec).
2.3.5 Perform increasing Ethanol series as described in step 2.2.

2.4 Hybridization and washing
2.4.1 Resuspend pelleted MCs in 1 mL prewarmed hybridization buffer (including probes added). Incubate in a 39°C hybridization oven while rotating (fix tube on rotor with tape) for 2 h.
2.4.2 Centrifuge MCs at 13,000 rpm (30 sec).
2.4.3 Remove hybridization buffer and add 1 mL prewarmed washing buffer. (Discard hybridization buffer (containing Formamide) in an appropriate waste container!)
2.4.4 Incubate at 41°C for 15 min.
2.4.5 Centrifuge MCs at 13,000 rpm (30 sec).
2.4.6 Remove washing buffer and add 1 mL ice-cold DI water.
2.4.7 Centrifuge MCs at 13,000 rpm (30 sec).
2.4.8 Remove water and resuspend pelleted MCs in 1 ml ice-cold DI water.
2.4.9 Hybridized MCs can be stored in the dark at 4°C for a short period of time.
2.4.10 If biotinylated probes and tyramide signal amplification (TSA) with horseradish peroxidase (HRP) 
are used, resuspend MCs in 2x SSC in step 2.4.8 instead and continue with the TSA detection protocol 
supplied by the manufacturer (Molecular Probes). 
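
The buffer tables and the temperature rule in Example 18 follow simple arithmetic, which the Python sketch below reproduces. It is an illustration only and not part of the described protocol: the function names and dictionary keys are hypothetical, the NaCl volumes are copied from the washing-buffer table, and applying the "7°C per 10% Formamide" rule linearly to other temperature differences is an assumption.

```python
# Illustrative sketch only; function names and keys are assumptions,
# not part of the protocol. All volumes are in microliters (µl).

# 5 M NaCl per 5 mL washing buffer, keyed by the % formamide (FA) used in
# the hybridization buffer (values copied from the washing-buffer table).
WASH_NACL_UL = {
    0: 900, 5: 630, 10: 450, 15: 318, 20: 215, 25: 149, 30: 102,
    35: 70, 40: 46, 45: 30, 50: 18, 60: 4, 70: 0,
}


def hybridization_buffer(fa_percent):
    """Component volumes for 1 mL hybridization buffer at a given % FA."""
    if not 0 <= fa_percent <= 70:
        raise ValueError("the table covers 0-70% formamide")
    formamide = fa_percent * 10          # 1% FA corresponds to 10 µl in 1 mL
    return {
        "nacl_5M": 180,
        "tris_1M": 20,
        "sds_10pct": 1,                  # added at the very end
        "formamide": formamide,
        "water": 800 - formamide,        # water volume as tabulated
    }


def washing_buffer(fa_percent):
    """Component volumes for 5 mL washing buffer matching a given % FA."""
    if fa_percent not in WASH_NACL_UL:
        raise ValueError("no table entry for this formamide concentration")
    nacl = WASH_NACL_UL[fa_percent]
    tris, sds = 100, 5
    edta = 50 if fa_percent >= 20 else 0  # EDTA is tabulated only from 20% FA
    water = 5000 - nacl - tris - sds - edta
    return {"nacl_5M": nacl, "tris_1M": tris, "sds_10pct": sds,
            "edta_0.5M": edta, "water": water}


def adjust_formamide(fa_percent_literature, temperature_drop_c=7.0):
    """% FA that compensates for a lower hybridization temperature.

    Uses the protocol's rule of thumb (7 °C lower is equivalent to 10% more
    FA); scaling it linearly to other temperature drops is an assumption.
    """
    return fa_percent_literature + 10.0 * temperature_drop_c / 7.0
```

For instance, `adjust_formamide(35)` returns 45.0, the Bet42a case quoted in the protocol, and `washing_buffer(45)` reproduces the 45% FA row of the washing-buffer table.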

EXAMPLE 21: DEPLOYING CELLS/MICROCAPSULES ON SLIDE FOR LASER CAPTURE MICROSCOPY

1. Using the FACS, individual microcapsules are spotted onto the slide directly (1-384 per slide).
2. Microcapsules may also be diluted in buffer or media. A 20 microliter pipette tip is submerged into 
the diluted microcapsule solution, and the solution is manually spotted onto the slide by touching the 
slide briefly with the tip. 

Example 22: Encapsulation protocol

The following example describes an exemplary encapsulation protocol for a cell. 

Prepare:
2 water baths (80°C, 42°C)
1.5% Agarose solution in 1x PBS or sea water (depending on origin of sample to be encapsulated):
  o add 0.75 g SeaPlaque Agarose and 0.75 g Ultrapure Agarose (USB Corp.) to 100 ml 1x PBS
  o dissolve by boiling in microwave oven
  o aliquot 1 mL each in 1.5 ml Eppendorf tubes
  o keep dissolved in 80°C waterbath
1 scintillation vial filled with 15 mL mineral oil
Appropriate cell dilutions (check microscopically); dilute cells in appropriate buffers (PBS, sterile sea water, etc.)

Procedure:
Add 60 µL Pluronic Acid to 1 ml of OneCellSystems Agarose
Preheat mineral oil to 42°C
Transfer melted agarose to 42°C, wait 10 min to equilibrate temperature
Pipet 200 µL of the diluted cells into the agarose, vortex
Pipet 1 mL of the agarose/cell mixture into 15 mL of preheated mineral oil (avoid bubbles) (up to 2 mL can be combined for 15 mL mineral oil)
Close scintillation vial and shake
Screw scintillation vial into GMD maker and run the following protocol:
  2,900 rpm, 2 min, RT
  2,900 rpm, 1 min, on ice
  1,400 rpm, 6 min, on ice
Transfer oil/agarose suspension into two 15 mL tubes, add 10 mL 1x PBS or sterile sea water
Centrifuge (2,500 rpm, 10 min)
Remove mineral oil and PBS from GMD pellet
Resuspend GMDs in PBS or sterile sea water and transfer to new tube
Centrifuge (2,500 rpm, 5 min)
Remove supernatant
Resuspend GMDs in PBS or sterile sea water and transfer to new tube
Centrifuge (2,500 rpm, 5 min); if necessary, repeat this step one more time
Resuspend GMDs in PBS or sterile sea water and store at 4°C

[000494] While the invention has been described in detail with reference to certain preferred aspects 
thereof, it will be understood by those of ordinary skill in the art that modifications and variations are 
within the spirit and scope of that which is described and claimed, and that such modifications and 
variations may be used with the method of the claimed invention. 

WHAT IS CLAIMED IS:

1. A method for maintaining a cell from a population of cells comprising: (a) 
encapsulating or enclosing within a microenvironment at least a single cell from a population of cells, 
wherein the microenvironment allows exchange of aqueous nutrients from the exterior to the interior of the 
microenvironment; (b) placing the microenvironment comprising the cell into a containment device; and (c) 
incubating the microenvironment in the containment device under conditions allowing the cell to survive 
and be maintained, wherein conditions allowing the maintained cell to survive comprise an exchange of 
aqueous nutrients from the exterior to the interior of the microenvironment, thereby maintaining the cell. 

2. The method of claim 1, wherein the population of cells comprises a mixed population of cells. 

3. The method of claim 2, wherein the mixed population of cells is uncultivated. 

4. The method of claim 1, wherein the population of cells is uncultivated. 

5. The method of claim 2, wherein the population of cells is derived from an environmental sample. 

6. The method of claim 5, wherein the environmental sample is selected from the group consisting of: 
geothermal fields, hydrothermal fields, acidic soils, sulfotara mud pots, boiling mud pots, pools, 
hot-springs, geysers, marine actinomycetes, metazoan, endosymbionts, ectosymbionts, tropical soil, temperate 
soil, arid soil, compost piles, manure piles, marine sediments, freshwater sediments, water concentrates, 
hypersaline sea ice, super-cooled sea ice, arctic tundra, Sargasso sea, open ocean pelagic, marine snow, 
microbial mats, whale falls, springs, hydrothermal vents, gut microbial communities, plant endophytes, 
epiphytic water samples, industrial sites and ex situ enrichments. 

7. The method of claim 5, wherein the environmental sample is selected from the group consisting of air, 
water, sediment, soil and rock. 

8. The method of claim 1, wherein the population of cells comprise at least one eukaryote cell, prokaryote 
cell, myxobacteria (epothilone) cell, yeast cell, archaeal cell, plant cell, mammalian cell, insect cell or 
protozoan cell. 

9. The method of claim 1, wherein the population of cells comprises a mixture of materials. 

10. The method of claim 9, wherein the mixture of materials comprises a biological sample, soil or sludge. 

11. The method of claim 10, wherein the biological sample comprises a plant sample, a food sample, a gut 
sample, a salivary sample, a blood sample, a sweat sample, a urine sample, a spinal fluid sample, a tissue 
sample, a vaginal swab, a stool sample, an amniotic fluid sample or a buccal mouthwash sample. 

12. The method of claim 1, wherein the encapsulated or enclosed cell is a microorganism. 

13. The method of claim 12, wherein the microorganism is a bacterial cell, a yeast cell, an archaeal cell, a 
plant cell, a mammalian cell, an insect cell or a protozoan cell. 

14. The method of claim 1, wherein the encapsulated or enclosed cell is an extremophile. 

15. The method of claim 14, wherein the extremophile is selected from the group consisting of 
hyperthermophiles, psychrophiles, halophiles, psychrotrophs, alkalophiles and acidophiles. 

16. The method of claim 1, wherein the microenvironment is designed or manufactured to be porous. 

17. The method of claim 16, wherein aqueous fluids can pass into or out of the microenvironment. 

18. The method of claim 16, wherein nucleic acids, antibodies, hormones or cytokines can pass into or out 
of the microenvironment. 

19. The method of claim 18, wherein plasmids or fosmids can pass into or out of the microenvironment. 

20. The method of claim 16, wherein the microenvironment comprises a porous gel. 

21. The method of claim 20, wherein the porous gel comprises a porous gel microdroplet (GMD). 

22. The method of claim 21, wherein one cell is encapsulated or enclosed in each porous gel microdroplet 
(GMD). 

23. The method of claim 21, wherein the porous gel microdroplet (GMD) comprises an emulsion matrix or 
an encapsulation matrix. 

24. The method of claim 1, wherein the microenvironment comprises a hydrogel matrix, a porous 
membrane or a selectively permeable membrane. 

25. The method of claim 1, wherein the microenvironment comprises a microfluidic channel. 

26. The method of claim 1, wherein the microenvironment comprises a liposome or a ghost cell. 

27. The method of claim 1, wherein the microenvironment comprises a (per) fluorinated amorphous 
polymer. 

28. The method of claim 1, wherein the microenvironment comprises a capillary array. 

29. The method of claim 1, wherein the capillary array comprises a non-addressable capillary array or a 
GIGAMATRIX™ array. 

30. The method of claim 1, wherein the microenvironment comprises a porous chromatographic membrane 
or a three-dimensional porous structure. 

31. The method of claim 1, wherein two, three, four, five, six, seven, eight, nine, ten or more cells are 
encapsulated or enclosed in each microenvironment. 

32. The method of claim 1, wherein the microenvironment comprises a growth column. 

33. The method of claim 32, wherein the growth column comprises a capillary. 

34. The method of claim 33, wherein the capillary comprises a capillary array. 

35. The method of claim 33, wherein the capillary array comprises a non-addressable high throughput 
capillary-based array in a holding plate. 

36. The method of claim 35, wherein said non-addressable capillary-based array in a holding plate is a 
non-addressable capillary array or a GIGAMATRIX™. 

37. The method of claim 32, wherein the growth column comprises a chromatography column. 

38. The method of claim 1, wherein conditions allowing the maintained cell to survive comprise providing 
nutrients at in situ concentrations. 

39. The method of claim 1, wherein conditions allowing the maintained cell to survive are equivalent to 
environmental conditions from which the cell was initially derived. 

40. The method of claim 39, wherein the environmental conditions are the equivalent of a geothermal field, 
a hydrothermal field, an acidic soil, a sulfotara mud pot, a boiling mud pot, a pool, a hot-spring, a geyser, 
a tropical soil, a temperate soil, an arid soil, a compost pile, a manure pile, a marine sediment, a freshwater 
sediment, a water concentrate, a hypersaline sea ice, a super-cooled sea ice, an arctic tundra, a fresh water 
environment, a salt water marine environment, an open ocean pelagic environment, a marine snow, a 
microbial mat, a whale fall, a spring or a hydrothermal vent. 

41. The method of claim 1, wherein the aqueous nutrient mixture is flowed through the containment device 
such that the microenvironments are suspended in the containment device, circulated in the containment 
device or agitated in the containment device. 

42. The method of claim 1, wherein the containment device comprises a system comprising an influx port 
and an efflux port for media. 

43. The method of claim 42, wherein the influx port is positioned at the bottom of the system and the efflux 
port is positioned above the influx port. 

44. The method of claim 43, wherein the influx port is positioned at the bottom of a column and the efflux 
port is positioned at the top of the column. 

45. The method of claim 42, wherein the system comprises a collection device for media leaving the efflux 
port. 

46. The method of claim 42, wherein the system recycles media. 

47. The method of claim 46, wherein the system filters waste from the media before recycling the media. 

48. The method of claim 1, further comprising incubating and culturing the cell in the microenvironment 
under conditions allowing growth or proliferation of the cell into a microcolony comprising at least two 
daughter cells. 

49. The method of claim 48, wherein the microcolony comprises between about 4 and 10, 4 and 50, 4 and 
100, or 4 and 1000 or more cells. 

50. The method of claim 1, further comprising isolating a microenvironment. 

51. The method of claim 43, further comprising isolating a microenvironment and isolating a cell or a 
microcolony from the microenvironment. 

52. The method of claim 51, further comprising isolating a cell from the microcolony. 

53. The method of claim 50, wherein isolating a microenvironment comprises sorting an encapsulated or 
enclosed microcolony by its size. 

54. The method of claim 50, wherein isolating a microenvironment comprises sorting an encapsulated or 
enclosed microcolony by the number of cells in the microcolony. 

55. The method of claim 50, wherein isolating a microenvironment comprises sorting an encapsulated or 
enclosed microcolony based on whether or not at least one cell in the microcolony expresses a marker. 

56. The method of claim 55, wherein the marker is a nucleic acid, a carbohydrate, a small molecule or a 
protein. 

57. The method of claim 56, wherein the marker is a detectable probe. 

58. The method of claim 57, wherein the detectable probe is a labeled nucleic acid probe or a labeled 
antibody. 

59. The method of claim 58, wherein the labeled nucleic acid probe is detected by FISH, flow cytometry, 
fluorescence detection, visible light detection, UV light detection or a combination thereof. 

60. The method of claim 51, wherein sorting a microcolony by size or by the number of cells in the 
microcolony comprises using flow cytometry. 

61. The method of claim 60, wherein the flow cytometry is FACS. 

62. The method of claim 52, further comprising maintaining the isolated cell by re-encapsulating or re- 
enclosing the cell in a microenvironment and re-culturing. 

63. The method of claim 62, wherein between about 1 and 16, 1 and 100, 4 and 64, 16 and 200 or more cells 
are maintained in each re-encapsulated microcolony. 

64. The method of claim 1, further comprising screening the interactions between enclosed or encapsulated 
cells. 

65. The method of claim 50, further comprising re-culturing the isolated microenvironment under the same 
or different conditions. 

66. The method of claim 1, further comprising direct amplification of a nucleic acid from an enclosed or 
encapsulated cell. 

67. The method of claim 48, further comprising direct amplification of a nucleic acid from a cultivated, 
encapsulated cell. 

68. The method of claim 50, wherein isolating a microenvironment comprises use of Microcapsule (MiC) 
in situ hybridization. 

69. The method of claim 50, wherein isolating a microenvironment comprises use of Rolling Circle 
Amplification (RCA). 

70. The method of claim 50, wherein isolating a microenvironment comprises use of Large Insert FACS 
Biopanning (LIFB), FISH or fluorescence detection. 

71. A method for identifying a polynucleotide encoding an activity of interest comprising (a) encapsulating 
or enclosing in a microenvironment at least a single cell from a mixed population of cells or an uncultivated 
population of cells, wherein the microenvironment allows exchange of aqueous nutrients from the exterior 
to the interior of the microenvironment; (b) placing the encapsulated cell in a containment device; (c) 
incubating the encapsulated or enclosed cell in the microenvironment under conditions allowing the 
encapsulated or enclosed cell to survive and be maintained, wherein conditions allowing the maintained 
cell to survive comprise exchange of aqueous nutrients from the exterior to the interior of the 
microenvironment; (d) contacting a nucleic acid isolated or derived from the encapsulated cell with at least 
one nucleic acid probe comprising a detectable label, wherein the nucleic acid probe is capable of 
specifically hybridizing to a polynucleotide of interest; and (e) detecting a specific hybridization between a 
nucleic acid isolated or derived from the cell and the nucleic acid probe, thereby identifying a 
polynucleotide of interest. 

72. The method of claim 71, further comprising enriching for a polynucleotide encoding an activity of 
interest by isolating or amplifying the nucleic acid identified by the specific hybridization between the 
nucleic acid isolated or derived from the encapsulated cell and the nucleic acid probe. 

73. The method of claim 71, further comprising use of Large Insert FACS Biopanning (LIFB), FISH or 
fluorescence detection. 

74. A method for identifying or detecting a biomolecule of interest from a population of cells, comprising: 
(a) encapsulating or enclosing at least one cell from a population in a microenvironment, wherein the 
microenvironment allows exchange of aqueous nutrients from the exterior to the interior of the 
microenvironment; (b) placing the microenvironment comprising at least one cell in a containment device; 
(c) incubating the microenvironment in the containment device under conditions allowing the encapsulated 
or enclosed cell to survive and be maintained, wherein conditions allowing the maintained cell to survive 
comprise exchange of aqueous nutrients from the exterior to the interior of the microenvironment; and (d) 
identifying or detecting the biomolecule of interest. 

75. The method of claim 74, further comprising isolating the biomolecule of interest. 

76. The method of claim 74, wherein the incubating step further comprises sufficient time for the enclosed 
or encapsulated cells to grow or proliferate. 

77. The method of claim 74, wherein the population is from a mixed population of cells or a population of 
uncultivated cells. 

78. The method of claim 74, wherein the biomolecule of interest comprises a nucleic acid, a protein, a 
carbohydrate, a lipid or a small molecule. 

79. The method of claim 78, wherein the nucleic acid comprises a genomic DNA (gDNA) or an RNA. 

80. The method of claim 74, wherein the identifying step comprises the steps: a) contacting a nucleic acid 
isolated or derived from the enclosed or encapsulated cell with at least one nucleic acid probe comprising a 
detectable label, wherein the nucleic acid probe is capable of specifically hybridizing to a polynucleotide 
encoding an activity of interest; and b) detecting a specific hybridization between a nucleic acid isolated or 
derived from the encapsulated cell and the nucleic acid probe, thereby identifying a polynucleotide 
encoding an activity of interest. 

81. The method of claim 74, further comprising use of Large Insert FACS Biopanning (LIFB) or FISH or 
fluorescence detection to identify a nucleic acid. 

82. The method of claim 74, wherein a biomolecule of interest is identified by detecting by hybridization 
with a probe having a complementary sequence to a nucleic acid of interest. 

83. The method of claim 82, wherein said detecting is done by FISH. 

84. The method of claim 82, wherein said detecting is done by 16S rRNA hybridization. 

85. The method of claim 74, wherein a biomolecule of interest is identified by detecting by using an 
antibody that specifically binds to the biomolecule of interest. 

86. The method of claim 75, wherein isolating the biomolecule of interest comprises use of flow cytometry. 

87. The method of claim 75, wherein isolating the biomolecule of interest comprises use of fluorescence 
analysis. 

88. The method of claim 83, wherein the fluorescence analysis is FACS. 

89. The method of claim 75, further comprising isolating a cell comprising the biomolecule of interest. 

90. The method of claim 89, wherein the cell is isolated by laser capture microscopy. 

91. The method of claim 89, further comprising the step of visualizing the morphology of a cell. 

92. The method of claim 91, wherein the visualizing step is done by providing a monolayer of 
microenvironments comprising at least one cell and a substrate to support said monolayer for use with a 
laser capture microscopy system. 

93. The method of claim 92, wherein the visualizing is done by laser capture microdissection device. 

94. The method of claim 90 or claim 92, wherein said laser capture microdissection device is a PALM® 
microscope. 

95. The method of claim 89, wherein the cell is identified or isolated by laser catapult. 

96. The method of claim 74, wherein the identified biomolecule of interest comprises a transcript, a gene or 
a gene pathway. 

97. The method of claim 74, further comprising analysis of the biomolecule of interest after identifying the 
biomolecule of interest. 

98. The method of claim 96 or claim 97, wherein a transcript or a gene is amplified before the identifying 
or analyzing step. 

99. The method of claim 98, wherein the transcript or gene is amplified by PCR. 

100. The method of claim 98, wherein the transcript or gene is amplified by rolling circle amplification. 

101. The method of claim 98, further comprising generating a library from the amplified sequences. 

102. The method of claim 101, further comprising sequencing the library. 

103. The method of claim 74, wherein biomolecule of interest comprises a small molecule, a protein, a 
lipid, a metabolite, a secondary metabolite, a carbohydrate or a nucleic acid. 

104. The method of claim 74, further comprising step (e) isolating a microenvironment comprising at least 
one cell. 

105. The method of claim 104, further comprising isolating a biomolecule from a microenvironment. 

106. The method of claim 105, further comprising isolating a cell from the isolated microenvironment, 
followed by isolating a biomolecule from the cell. 

107. The method of claim 105, wherein the biomolecule is secreted by a cell into the microenvironment. 

108. The method of claim 84, wherein the biomolecule is detected by hybridization, sequencing, enzymatic 
reaction or a secondary metabolite assay. 

109. A method for identifying or detecting a molecule of interest from a population of cells, comprising: (a) 
encapsulating or enclosing at least one cell from the population in a microenvironment, wherein the 
microenvironment allows exchange of aqueous nutrients from the exterior to the interior of the 
microenvironment; (b) placing the microenvironment comprising said at least one cell in a containment 
device, wherein the containment device is fitted or configured such that an aqueous nutrient mixture can be 
flowed or circulated through the containment device such that there is an exchange of aqueous nutrients 
from the exterior to the interior of the microenvironment, or, aqueous nutrients flow or circulate through 
the interior of the microenvironment; (c) incubating the microenvironment comprising said at least one cell 
in the containment device under conditions allowing the encapsulated or enclosed cell to survive and be 
maintained, thereby isolating and maintaining the cell, wherein conditions allowing the maintained cell to 
survive comprise exchange of aqueous nutrients from the exterior to the interior of the microenvironment; 
and (d) identifying or detecting a microenvironment comprising the biomolecule of interest. 

110. A system for maintaining a cell from a population of cells comprising: (a) a microenvironment 
encapsulating or enclosing at least a single cell from a population of cells, wherein the microenvironment 
allows exchange of fluids, nutrients or growth factors from the exterior to the interior of the 
microenvironment; (b) a containment device capable of incubating the microenvironment under conditions 
allowing the cell to survive and be maintained, wherein the containment device is fitted or configured such 
that an aqueous nutrient mixture can diffuse, flow or circulate through the containment device such that 
aqueous nutrients exchange from the exterior to the interior of the microenvironment, or, flow or circulate 
through the interior of the microenvironment, and the desired conditions can be maintained. 

111. The system of claim 110, wherein the population of cells comprises a mixed population of cells. 

112. The system of claim 111, wherein the mixed population of cells is uncultivated. 

113. The system of claim 110, wherein the population of cells is uncultivated. 

114. The system of claim 110, wherein the population of cells is derived from an environmental sample. 

115. The system of claim 114, wherein the environmental sample is selected from the group consisting of: 
geothermal fields, hydrothermal fields, acidic soils, sulfotara mud pots, boiling mud pots, pools, 
hot-springs, geysers, marine actinomycetes, metazoan, endosymbionts, ectosymbionts, tropical soil, temperate 
soil, arid soil, compost piles, manure piles, marine sediments, freshwater sediments, water concentrates, 
hypersaline sea ice, super-cooled sea ice, arctic tundra, Sargasso sea, open ocean pelagic, marine snow, 
microbial mats, whale falls, springs, hydrothermal vents, gut microbial communities, plant endophytes, 
epiphytic water samples, industrial sites and ex situ enrichments. 

116. The system of claim 114, wherein the environmental sample is selected from the group consisting of 
air, water, sediment, soil and rock. 

117. The system of claim 110, wherein the population of cells comprise at least one bacterial cell, 
eukaryote cell, prokaryote cell, myxobacteria (epothilone) cell, yeast cell, archaeal cell, plant cell, 
mammalian cell, insect cell or protozoan cell. 

118. The system of claim 110, wherein the population of cells comprises a mixture of materials. 

119. The system of claim 110, wherein the mixture of materials comprises a biological sample, soil or 
sludge. 

120. The system of claim 119, wherein the biological sample comprises a plant sample, a food sample, a 
gut sample, a salivary sample, a blood sample, a sweat sample, a urine sample, a spinal fluid sample, a 
tissue sample, a vaginal swab, a stool sample, an amniotic fluid sample or a buccal mouthwash sample. 

121. The system of claim 110, wherein the encapsulated or enclosed cell is a microorganism. 

122. The system of claim 117, wherein the microorganism is a bacterial cell, a yeast cell, an archaeal cell, a 
plant cell, a mammalian cell, an insect cell or a protozoan cell. 

123. The system of claim 110, wherein the encapsulated or enclosed cell is an extremophile. 

124. The system of claim 123, wherein the extremophile is selected from the group consisting of 
hyperthermophiles, psychrophiles, halophiles, psychrotrophs, alkalophiles and acidophiles. 

125. The system of claim 110, wherein polypeptides can pass into or out of the microenvironment. 

126. The system of claim 110, wherein aqueous fluids can pass into or out of the microenvironment. 

127. The system of claim 110, wherein nucleic acids, antibodies, hormones or cytokines can pass into or 
out of the microenvironment. 

128. The system of claim 110, wherein plasmids or fosmids can pass into or out of the microenvironment. 

129. The system of claim 110, wherein the microenvironment comprises a porous gel. 

130. The system of claim 129, wherein the porous gel comprises a porous gel microdroplet (GMD). 

131. The system of claim 130, wherein one cell is encapsulated or enclosed in each porous gel microdroplet 
(GMD). 

132. The system of claim 130, wherein the porous gel microdroplet (GMD) comprises a 
CELMIX™ emulsion matrix or a CELGEL™ encapsulation matrix. 

133. The system of claim 110, wherein the microenvironment comprises a hydrogel matrix, a porous 
membrane or a selectively permeable membrane. 

134. The system of claim 110, wherein the microenvironment comprises a microfluidic channel. 

135. The system of claim 110, wherein the microenvironment comprises a liposome or a ghost cell. 

136. The system of claim 110, wherein the microenvironment comprises a (per) fluorinated amorphous 
polymer. 

137. The system of claim 110, wherein the microenvironment comprises a capillary array. 

138. The system of claim 137, wherein the capillary array comprises a GIGAMATRIX™ array. 

139. The system of claim 130, wherein the microenvironment comprises a porous chromatographic 
membrane or a three-dimensional porous structure. 

140. The system of claim 110, wherein two, three, four, five, six, seven, eight, nine, ten or more cells are 
encapsulated or enclosed in each microenvironment. 

141. The system of claim 110, wherein the microenvironment comprises a growth column. 

142. The system of claim 141, wherein the growth column comprises a capillary. 

143. The system of claim 142, wherein the capillary comprises a capillary array. 

144. The system of claim 143, wherein the capillary array comprises a non-addressable high throughput 
capillary-based array in a holding plate. 

145. The system of claim 144, wherein said non-addressable capillary-based array in a holding plate is 
GIGAMATRIX™. 

146. The system of claim 141, wherein the growth column comprises a chromatography column. 

147. The system of claim 110, wherein the conditions for maintaining the cells comprise providing 
nutrients at in situ concentrations. 

148. The system of claim 110, wherein the conditions for maintaining the cells are equivalent to 
environmental conditions from which the cell was initially derived. 

149. The system of claim 148, wherein the environmental conditions are the equivalent of a geothermal 
field, a hydrothermal field, an acidic soil, a sulfotara mud pot, a boiling mud pot, a pool, a hot-spring, a 
geyser, a tropical soil, a temperate soil, an arid soil, a compost pile, a manure pile, a marine sediment, a 
freshwater sediment, a water concentrate, a hypersaline sea ice, a super-cooled sea ice, an arctic tundra, a 
fresh water environment, a salt water marine environment, an open ocean pelagic environment, a marine 
snow, a microbial mat, a whale fall, a spring or a hydrothermal vent. 

150. The system of claim 110, wherein the aqueous nutrient mixture is flowed through the containment
device such that the microenvironments are suspended in the containment device, circulated in the 
containment device or agitated in the containment device. 

151. The system of claim 110, wherein the containment device comprises a system comprising an influx 
port and an efflux port for media. 

152. The system of claim 151, wherein the influx port is positioned at the bottom of the system and the 
efflux port is positioned above the influx port. 

153. The system of claim 152, wherein the influx port is positioned at the bottom of a column and the 
efflux port is positioned at the top of the column. 

154. The system of claim 151, wherein the system comprises a collection device for aqueous nutrient 
mixture leaving the efflux port. 

155. The system of claim 110, wherein the system recycles aqueous nutrient mixture. 

156. The system of claim 155, wherein the system filters waste from the aqueous nutrient mixture before 
recycling the media.

157. A system for maintaining a cell from a population of cells comprising: (a) a plurality of
microenvironments, wherein each microenvironment encapsulates or encloses at least a single cell from the 
population of cells, and the microenvironment is configured such that a nutrient can pass through, diffuse, 
flow or circulate through its interior; (b) a containment device capable of incubating the
microenvironments under conditions allowing the cell to survive and be maintained, wherein the
containment device is fitted or configured such that an aqueous nutrient mixture can be flowed or circulated
through the containment device such that aqueous nutrients diffuse, flow or circulate through the interior of 
the microenvironments and the desired conditions can be maintained. 

158. The system of claim 110 or claim 157, comprising at least two containers in parallel arrangement with
an inlet and outlet flow. 

159. The system of claim 158, comprising at least 5, 10, 12, 18, 24, 36 or 96 containers in parallel arrangement
for receiving media and returning waste.

160. The system of claim 110 or claim 157, wherein at least two or more different species of
microorganisms are cultured in the system. 

161. The system of claim 160, wherein the at least two different species are in different containers. 

162. The system of claim 161, wherein the at least two different species are bacteria and fungi.

163. The system of claim 110 or claim 157, wherein the system comprises simultaneously maintaining at 
least two, three, four, five or more independently isolated samples from a similar environment. 

164. The system of claim 110 or claim 157, wherein the microenvironment comprises a pore size sufficient
to allow a molecule of at least 1 kilodalton (kD) to pass through the pore, or, the microenvironment comprises
a pore size that does not allow a molecule greater than 1 kilodalton (kD) to pass through the pore. 

165. The system of claim 164, wherein the microenvironment comprises a pore size that does not allow a
molecule greater than about 10, 50, 100, 150 or 200 kilodalton (kD) to pass through the pore.

166. The system of claim 110 or claim 157, wherein molecules larger than about 200 kD desired to be
co-incubated with the cells are co-encapsulated with the cells.